TCP/IP offload device with reduced sequential processing

ABSTRACT

A TCP Offload Engine (TOE) device includes a state machine that performs TCP/IP protocol processing operations in parallel. In a first aspect, the state machine includes a first memory, a second memory, and combinatorial logic. The first memory stores and simultaneously outputs multiple TCP state variables. The second memory stores and simultaneously outputs multiple header values. In contrast to a sequential processor technique, the combinatorial logic generates a flush detect signal from the TCP state variables and header values without performing sequential processor instructions or sequential memory accesses. In a second aspect, a TOE includes a state machine that performs an update of multiple TCP state variables in a TCB buffer all simultaneously, thereby avoiding multiple sequential writes to the TCB buffer memory. In a third aspect, a TOE involves a state machine that sets up a DMA move in a single state machine clock cycle.

CROSS REFERENCE TO RELATED APPLICATIONS

The present application claims the benefit under 35 USC 120 of (is a continuation-in-part of) application Ser. No. 10/729,111, which has the same title and inventors, and which is incorporated by reference herein.

BACKGROUND INFORMATION

Computers communicate over the internet (and many other networks) using the TCP/IP protocol suite. Such a computer communicates by transmitting information in TCP/IP packets onto the network and by receiving information in TCP/IP packets from the network. The TCP and IP protocols are, however, fairly complex. In a simple conventional computer architecture, the central processing unit (CPU) of the computer may have to sacrifice a considerable amount of its processing power to perform the TCP/IP protocol processing necessary to allow the computer to communicate over the network. This reduces the amount of processing power available to do the other tasks that are the principal functions of the computer.

Devices called TCP Offload Engine (TOE) devices have therefore been developed. In one definition, a TOE device is a device that performs some or all of the TCP and IP protocol processing for the computer such that the processing power of the computer's CPU can be devoted to other tasks. TOE devices are often realized on expansion cards called network interface cards (NICs). A NIC that includes a type of TOE device is sometimes called an Intelligent Network Interface Card (INIC).

U.S. Pat. No. 6,247,173 describes one example of a TOE. The TOE device includes a processor as well as several other devices. The processor on the TOE executes firmware instructions that are stored on the TOE device. As networking speeds have increased, so too have the processing demands imposed on the processor of such a TOE. One way TOE processing power has been increased is by increasing the clock rate of the processor. This increases the rate at which the processor fetches and/or executes instructions. There are, however, practical limits on how high the processor clock rate can be increased. Advances in semiconductor processing technology over time have allowed ever increasing processor clock rates, but it is envisioned that the rate of increase will not be adequate to keep pace with the future demands on processing power due to even more rapid increases in network speeds.

If TOE throughput cannot be sufficiently increased simply by increasing processor clock speeds, then other techniques will have to be employed if the desired increased throughput is to be achieved. One technique for increasing throughput involves increasing the width of the processor's data bus and ALU. Although this might increase the rate at which certain TOE functions are performed, the execution of other functions will still likely be undesirably slow due to the sequential processing nature of the other TCP/IP offload tasks. Other computer architecture techniques that might be employed involve using a multi-threaded processor and/or pipelining in an attempt to increase the number of instructions executed per unit time, but again clock rates can be limiting. A special purpose processor that executes a special instruction set particularly suited to TCP/IP protocol processing can be employed to do more processing with a given executed instruction, but such a processor still requires sequential fetching of instructions. Another technique that might be employed involves using a superscalar processor that executes multiple instructions at the same time. But again, TCP/IP protocol processing tasks often involve many different functions that are done in sequential fashion. Even with a special instruction set and a superscalar processor, it still may be necessary to increase clock rates beyond possible rates in order to meet the throughput demands imposed by future network speeds. It is envisioned that supporting the next generation of high-speed networks will require pushing the clock speeds of even the most state-of-the-art processors beyond available rates. Even if employing such an advanced and expensive processor on a TOE were possible, employing such a processor would likely be unrealistically complex and economically impractical. A solution is desired.

SUMMARY

A network interface device (NID) is capable of offloading a host computer of TCP protocol processing tasks. The NID is sometimes called a TOE (TCP Offload Engine) device.

In a first aspect, the NID involves a first memory, a second memory, and combinatorial logic. The first memory stores and simultaneously outputs multiple TCP state variables. These TCP state variables may, for example, include: a receive packet sequence limit number, an expected receive packet sequence number, a transmit sequence limit number, a transmit acknowledge number, and a transmit sequence number. The second memory stores and simultaneously outputs values from the header of an incoming packet. These header values may, for example, include: a receive packet sequence number, a packet payload size number, a packet acknowledge number, and a packet transmit window number.

The combinatorial logic simultaneously receives: 1) the TCP state variables from the first memory, and 2) the header values from the second memory. From at least two TCP state variables and at least two header values, the combinatorial logic generates a flush detect signal. The flush detect signal is indicative of whether an exception condition (for example, an error condition) has occurred related to the TCP connection to which the TCP state variables pertain. The combinatorial logic is, for example, part of a state machine where the flush detect signal is determined from the TCP state variables and the header values in a single clock period of the clock signal that causes the state machine to transition from state to state.

In contrast to a sequential processor that would fetch instructions, decode the instructions, and then execute the instructions in order to perform multiple reads of TCP state variables from a memory and to perform multiple reads of header variables from a memory and then to use these read values and variables to determine whether the flush condition exists, the combinatorial logic of the NID is supplied with all the TCP state variables and header values necessary to make the determination at the same time. Accordingly, retrieving the TCP state variables and header values does not involve a large number of time-consuming sequential memory accesses. Moreover, the state machine does not fetch instructions or execute instructions. No sequential fetching of instructions is necessary. No complex pipelining of instruction decoding is required. Still further, the determination of whether a flush condition exists is performed by high-speed combinatorial hardware logic to which the TCP state variables and header values are supplied as inputs. Rather than performing this test in software as a sequence of discrete logic tests, the combinatorial hardware logic performs the tests in parallel at hardware logic gate speeds.
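As a rough software model of the parallel test described above, the following Python sketch evaluates several comparisons over TCP state variables and header values in one expression, the way a set of hardware comparators feeding an OR gate would. The particular conditions tested, the field names, and the helper seq_lt are assumptions chosen for illustration only; they are not the flush conditions of the described device.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF  # TCP sequence numbers are 32-bit and wrap around


def seq_lt(a: int, b: int) -> bool:
    """True if sequence number a precedes b in modulo-2**32 arithmetic."""
    return a != b and ((b - a) & MASK32) < 0x80000000


@dataclass(frozen=True)
class TcpState:            # values output together by the first memory
    rcv_seq_exp: int       # expected receive sequence number
    rcv_seq_lmt: int       # receive sequence limit
    xmt_ack_num: int       # transmit acknowledge number (assumed: last ACK seen)
    xmt_seq_num: int       # transmit sequence number (assumed: next to send)


@dataclass(frozen=True)
class HeaderValues:        # values output together by the second memory
    rcv_seq_num: int       # sequence number of the incoming packet
    payload_size: int      # packet payload size
    ack_num: int           # acknowledge number of the incoming packet


def flush_detect(s: TcpState, h: HeaderValues) -> bool:
    # Each term models one comparator; hardware would evaluate all at once.
    out_of_order = h.rcv_seq_num != s.rcv_seq_exp
    end_seq = (h.rcv_seq_num + h.payload_size) & MASK32
    past_window = seq_lt(s.rcv_seq_lmt, end_seq)
    ack_too_old = seq_lt(h.ack_num, s.xmt_ack_num)
    ack_too_new = seq_lt(s.xmt_seq_num, h.ack_num)
    return out_of_order or past_window or ack_too_old or ack_too_new
```

In software the four terms are still computed one after another; the point of the sketch is only that the result depends on no intermediate state, so the same function can be realized as pure combinatorial logic.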

In a second aspect, a NID includes a state machine that updates multiple TCP state variables in a TCB buffer all simultaneously, thereby avoiding sequential instruction execution to perform this task and avoiding sequential writing of multiple values to a memory structure. Bottlenecks associated with writing to a memory structure through a narrow data bus are avoided. The memory structure holding the TCP state variables is very wide and allows all the values to be written into the memory at the same time such that the TCP state update is performed in one or a very small number of memory write operations.

In a third aspect, a NID involves a state machine and a DMA controller. The state machine uses the DMA controller to transfer information to, and receive information from, a host computer. To set up the DMA controller to make a move of information, the source of the information is supplied to the DMA controller, the amount of information to move is supplied to the DMA controller, and the destination address where the information is to be placed is given to the DMA controller. In contrast to a sequential processor NIC card design where multiple sequential steps are required to set up a DMA transfer, the state machine of the NID in accordance with the third aspect supplies the DMA controller with the source of the information to move, the amount of information to move, and the destination address where the information is to be placed all at the same time within one state of the state machine's operation.

The architectural aspect of storing TCP state variables and packet headers in a very wide memory structure in such a way that these variables and headers are accessed at one time in parallel and are processed by a hardware state machine, and such that the resulting updated TCP state variables are all written back to the wide memory in parallel in one or a very small number of memory writes, is applicable not only to systems where control of a TCP connection is passed back and forth between a TOE device and a host, but also to systems where a TOE remains in control of a TCP connection and where control of the TCP connection is not transferable between the TOE and the host. By employing novel TOE architectural aspects set forth in this patent document, the number of packets processed per unit time can be increased without increasing the maximum clock speed or, alternatively, if the number of packets to be processed per unit time is to remain constant then the maximum clock speed can be reduced. Reducing clock speed for a given amount of processing throughput reduces power consumption of the overall TOE system. Moreover, the novel TOE architecture allows memory access bandwidth requirements to be relaxed for a given amount of packet processing throughput, thereby allowing less expensive memories to be used and further reducing TOE system cost.

Other embodiments and details are also described below. This summary does not purport to define the invention. The claims, and not this summary, define the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 (Prior Art) is a diagram of a system in which a network interface device (NID) operates in accordance with the present invention.

FIG. 2 is a simplified block diagram of an application specific integrated circuit (ASIC) of the NID of FIG. 1. All the read and write strobe signals and all the control signals between the various blocks that are necessary to carry out the data movements described are not illustrated in the diagram because including all those signals would obscure the information flow being explained and would result in a large diagram that would be difficult to read in a printed patent document.

FIG. 3 is a table that sets forth the mnemonics used to refer to the various blocks of FIG. 2.

FIG. 4 is a table that sets forth how each of the blocks of FIG. 2 is organized.

FIG. 5 is a table that sets forth a brief description of each mnemonic used in the table of FIG. 4.

FIG. 6 is a table that sets forth the various variables that make up each variable in the left column of the table.

FIG. 7 sets forth in hardware description pseudocode what the various blocks of FIG. 2 do to perform slow-path receive processing of a packet.

FIG. 8 sets forth in hardware description pseudocode what the various blocks of FIG. 2 do in a connection handout sequence.

FIG. 9 sets forth in hardware description pseudocode what the various blocks of FIG. 2 do to perform fast-path receive processing of a packet.

FIG. 10 sets forth in hardware description pseudocode what occurs in each of the states of the socket engine SktEng state machine of FIG. 2.

FIG. 11 is a diagram illustrating the organization of a host message buffer.

FIG. 12 sets forth in hardware description pseudocode what the various blocks of FIG. 2 do in a connection flush sequence.

FIG. 13 sets forth in hardware description pseudocode what the combinatorial logic of the socket engine SktEng state machine does to perform incoming acknowledge (ACK) processing.

FIG. 14 is a diagram that illustrates various relationships between the various variables used in FIG. 13.

FIG. 15 sets forth in hardware description pseudocode what the combinatorial logic of the socket engine SktEng state machine does to perform incoming data processing.

FIG. 16 is a diagram that illustrates various relationships between the various variables used in FIG. 15.

FIG. 17 sets forth the format of a header buffer.

FIG. 18 sets forth the various parts of the FRAME STATUS A component of the header buffer of FIG. 17.

FIG. 19 sets forth the various parts of the FRAME STATUS B component of the header buffer of FIG. 17.

FIG. 20 sets forth the various component values that make up a TCB and that are stored in a TCB buffer.

DETAILED DESCRIPTION

FIG. 1 is a simplified diagram of an exemplary system 1 used to illustrate the present invention. System 1 is coupled to a packet-switched network 2. Network 2 can, for example, be a local area network (LAN) and/or a collection of networks. Network 2 can, for example, be the Internet. Network 2 can, for example, be an IP-based SAN that runs iSCSI. Network 2 may, for example, be coupled to system 1 via media that communicate electrical signals, via fiber optic cables, and/or via a wireless communication channel.

System 1 includes a host 3 and a network interface device (NID) 4. Host 3 may, for example, be embodied on a motherboard. NID 4 may, for example, be an expansion card that couples to the motherboard. Host 3 includes a central processing unit (CPU) 5 or CPU chip-set, and an amount of storage 6. In the illustrated example, storage 6 includes a combination of semiconductor memory and magnetic disc storage. CPU 5 executes software stored in storage 6. The software includes a network protocol processing stack including a media access protocol processing layer, an IP protocol processing layer, a TCP protocol processing layer, and an application layer. The protocol layer on top of the TCP protocol processing layer is sometimes called a session layer and is sometimes called an application layer. In the description below, the layer is referred to as the application layer.

NID 4 is coupled to host 3 via host bus 7, a bridge 8, and local bus 9. Host bus 7 may, for example, be a PCI bus or another computer expansion bus. NID 4 includes an application specific integrated circuit (ASIC) 10, an amount of dynamic random access memory (DRAM) 11, and Physical Layer Interface (PHY) circuitry 12. NID 4 includes specialized protocol processing acceleration hardware for implementing “fast-path” processing wherein certain types of network communications are accelerated in comparison to “slow-path” processing wherein the remaining types of network communications are handled at least in part by the protocol processing stack executing on host 3. In the illustrated embodiment, the certain types of network communications accelerated are TCP/IP communications that meet certain fast-path criteria. NID 4 is therefore sometimes called a TCP Offload Engine (TOE).

For additional information on systems that perform fast-path and slow-path processing, see: U.S. Pat. No. 6,247,060; U.S. Pat. No. 6,226,680; U.S. Pat. No. 6,247,173; Published U.S. Patent Application No. 20010021949; Published U.S. Patent Application No. 20010027496; Published U.S. Patent Application No. 20010047433; and U.S. patent application Ser. No. 10/420,364 (the content of each of the above-identified patents, published patent applications and patent applications is incorporated herein by reference). System 1 of FIG. 1 employs techniques set forth in these documents for transferring control of TCP/IP connections between a host protocol processing stack and a network interface device. That information on how control of a TCP/IP connection is passed from host to network interface device and from network interface device back to the host is expressly incorporated by reference into this document as background information.

FIG. 2 is a block diagram of ASIC 10 of FIG. 1. ASIC 10 includes a receive MAC block (RcvMac) 20, a receive parser block (RcvPrs) 21, a socket detector block (SktDet) 22, a receive manager block (RcvMgr) 23, a socket engine block (SktEng) 24, a DMA controller/PCI bus interface unit block 25, a DRAM controller block 26, an SRAM block 27, a transmit format sequencer block (XmtFmtSqr) 28, and a transmit MAC block (XmtMac) 29. The remaining blocks indicate memory structures such as buffers and queues. A block labeled with a label ending in the letter “Q” indicates the block is a queue. A block labeled with a label ending in the letters “Buf” indicates the block is a buffer or a set of buffers. The queues and buffers are used to pass information between the various other blocks.

Operation of system 1 is described below by describing: 1) a slow-path receive sequence, 2) a connection handout sequence wherein control of a connection is passed from the host to the NID, 3) a fast-path receive sequence wherein TCP and IP protocol processing of packets received over the handed out connection is performed on the NID, 4) operation of the socket engine, and 5) a connection flush sequence wherein control of a connection is passed from the NID to the host. In these descriptions, mnemonics are used to refer to the various blocks of FIG. 2.

FIG. 3 is a table that sets forth the mnemonics used to refer to the various blocks of FIG. 2. Each of the different types of memory blocks in FIG. 2 contains information that is formatted in a certain fashion.

FIG. 4 is a table that sets forth how each of the memory blocks of FIG. 2 is organized. The first row of FIG. 4, for example, indicates that the particular header buffer (HdrBuf) entry addressed by HdrId contains a parse header (PrsHd) followed by a packet header (PktHd). HdrBuf is a mnemonic that stands for header buffer, HdrId is a mnemonic that stands for header buffer identifier, PrsHd is a mnemonic that stands for parse header, and PktHd is a mnemonic that stands for packet header.

FIG. 5 is a table that sets forth a brief description of each mnemonic used in the table of FIG. 4. The items of information set forth in FIG. 5 are sometimes called “variables” in the description below. A variable may itself be made up of multiple other variables.

FIG. 6 is a table that sets forth the various variables that make up each variable in the left column of the table. The first row of FIG. 6, for example, indicates that a socket tuple (SkTpl) is made up of a transmit acknowledge number (XmtAckNum), a transmit sequence number (XmtSeqNum), a transmit sequence limit (XmtSeqLmt), a transmit congestion control window size (XmtCcwSz), a maximum segment size (MaxSegSz), a maximum transmit window (MaxXmtWin), a receive sequence limit (RcvSeqLmt), an expected receive sequence number (RcvSeqExp), an expected header length (ExpHdrLen), and a transmit control block address (TcbAd).

Slow-Path Packet Receive Sequence:

FIG. 7 sets forth, in hardware description pseudocode, what the various blocks of FIG. 2 do to perform slow-path processing of a packet. A line in the pseudocode is referred to by the line number that appears in parentheses to the left of the line.

Initially, a packet of network information enters ASIC 10 from PHY 12 as indicated by arrow 30 in FIG. 2. In FIG. 7, lines 701-703 indicate actions taken by the RcvMac block (RcvMac) 20. As indicated in line 701, RcvMac 20 parses the incoming packet RcvPk and looks at preambles and postambles coming off the network. RcvMac 20 checks the CRC value at the end of the packet, detects whether there was a collision after the packet started being received onto RcvMac, and performs other standard functions. RcvMac 20 generates a status word RcvSt indicating the results of this checking. The RcvMac looks at destination link addresses and based on this information filters out packets that are not destined for NID 4. In one embodiment, the RcvMac is a commercially available block of circuitry designed by another company. If the packet is destined for NID 4, then RcvMac 20 puts a corresponding entry on the first-in-first-out RcvMacQ 31. The pushing of the entry includes the packet RcvPk as well as the status word RcvSt as indicated by lines 702 and 703.

The receive parser block (RcvPrs) 21 pops the entry off the RcvMacQ. In FIG. 7, lines 705-708 indicate actions taken by RcvPrs block 21. But before RcvPrs 21 pops the entry, it identifies a location in DRAM 11 where the information will be stored. All packets passing through NID 4 are temporarily stored in DRAM 11. DRAM 11 is 2M bytes in total and is sectioned into 1024 buffers. Each buffer is 2048 bytes in length. Each buffer can therefore be identified by a 10-bit buffer identifier (BufId) that identifies the starting address of the buffer. To obtain the 21-bit starting address of the corresponding buffer, eleven zeros are added to the right of BufId. Storing the BufId without the eleven zeros (that would always be zeros anyway) saves memory space.

A queue called the receive buffer queue RcvBufQ (see FIG. 2) is a queue that stores BufIds of DRAM buffers that are free and available for use by RcvPrs block 21. RcvPrs block 21 pops the RcvBufQ and obtains a BufId as indicated by line 705. RcvPrs receives the BufId value, and shifts the value eleven bits to the left to obtain the starting address of the buffer in DRAM. Once RcvPrs has a place to put the packet, RcvPrs starts parsing the entry pulled off the RcvMacQ (line 706) and generates a parse header PrsHd.
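The address arithmetic described above (2M bytes of DRAM, 1024 buffers of 2048 bytes, a 10-bit BufId extended by eleven zero bits) can be sketched in Python as follows. The constant and function names are illustrative and do not appear in the figures.

```python
BUF_SIZE = 2048   # bytes per DRAM buffer
NUM_BUFS = 1024   # 2M bytes / 2048 bytes per buffer
BUF_SHIFT = 11    # log2(2048): the eleven implied zero bits


def buf_start_address(buf_id: int) -> int:
    """Return the 21-bit DRAM starting address for a 10-bit BufId."""
    if not 0 <= buf_id < NUM_BUFS:
        raise ValueError("BufId must fit in 10 bits")
    # Appending eleven zeros on the right is a left shift by eleven bits.
    return buf_id << BUF_SHIFT
```

For example, buf_start_address(1) is 2048 and buf_start_address(1023) is 0x1FF800, the start of the last buffer in the 2M-byte DRAM.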

The constituents of this parse header PrsHd are indicated in FIG. 6. For additional, more detailed information, see FIGS. 18 and 19. Frame status A and frame status B together comprise the PrsHd. Bit 23 of frame status A is the attention bit Attn. The Attn bit indicates, among other exception and error conditions, if the transport and network layer protocols of the packet are not TCP and IP.

One of the constituents of the parse header PrsHd is a header length code HdrCd. The header length code HdrCd indicates how deep the headers extend into the packet that was received. As indicated by FIG. 4, the RcvMacQ entry includes the receive packet RcvPk and receive status RcvSt. Returning to FIG. 7, RcvPrs block 21 prepends (line 707) the parse header PrsHd onto the packet RcvPk and writes this information into the receive buffer RcvBuf identified by the value BufId (BufId was popped off the RcvBufQ in line 705). RcvPrs 21 performs this write by putting the information to be written into PrsWrDatQ and then putting an instruction to DRAM controller 26 onto PrsWrReqQ. The instruction on PrsWrReqQ tells DRAM controller 26 to move data out of PrsWrDatQ and to put it into the appropriate buffer in DRAM. The request instruction indicates a number of bytes to pull off the PrsWrDatQ and also indicates a starting address in DRAM. DRAM controller 26 at some time later pops the request instruction off PrsWrReqQ that tells it how much to pop off the PrsWrDatQ and where to write the data. The designation in line 707 of “write RcvBuf [BufId] {PrsHd, RcvPk}” means write to the “receive buffer” identified by the value BufId, and to write the parse header PrsHd prepended to the RcvPk. The { } symbols enclose the information to write. The bracket symbols [ ] enclose a value that indicates which receive buffer within RcvBuf it is that will be written.

Next, in line 708, RcvPrs block 21 writes an entry onto the parse event queue PrsEvtQ 32 (see FIG. 2). The entry includes a packet end indicator PkEnd that alerts the socket detector block SktDet 22 of the arrival of the packet and the next free location (PkEnd) that can be written to in the buffer. PkEnd (“packet end”) is a pointer that points to the end of the packet in DRAM (the beginning address of the buffer in DRAM, plus the size of the packet, plus the size of the parse header). From PkEnd, the socket detector SktDet extracts the original buffer ID that points to the beginning of the buffer and also extracts the size of the packet. The receive parser RcvPrs also includes a socket hash (SkHsh) and the socket descriptor (SktDsc) in the entry that it pushes onto PrsEvtQ. As indicated by FIG. 6, the socket descriptor SktDsc includes: a header length code (HdrCd), the IP source address (SrcAd), the IP destination address (DstAd), the TCP source port (SrcPt), and the TCP destination port (DstPt). Each socket descriptor includes a DetEn bit.
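The following Python sketch illustrates how a buffer identifier and a byte count might be recovered from PkEnd. It assumes that the parse header plus the packet always occupy fewer than 2048 bytes, so that the upper ten bits of the 21-bit PkEnd equal BufId and the lower eleven bits give the number of bytes written into the buffer. The function names and the 8-byte parse header length used in the example are hypothetical.

```python
BUF_SHIFT = 11                       # each DRAM buffer is 2**11 = 2048 bytes
OFFSET_MASK = (1 << BUF_SHIFT) - 1   # low eleven bits: offset within buffer


def make_pk_end(buf_id: int, prs_hd_len: int, pkt_len: int) -> int:
    """Buffer start address + parse header size + packet size."""
    used = prs_hd_len + pkt_len
    if used > OFFSET_MASK:
        raise ValueError("parse header plus packet must fit below 2048 bytes")
    return (buf_id << BUF_SHIFT) + used


def split_pk_end(pk_end: int, prs_hd_len: int) -> tuple:
    """Recover (BufId, packet length) from PkEnd."""
    buf_id = pk_end >> BUF_SHIFT
    pkt_len = (pk_end & OFFSET_MASK) - prs_hd_len
    return buf_id, pkt_len
```

With a hypothetical 8-byte parse header, make_pk_end(5, 8, 1500) gives 11748, and split_pk_end(11748, 8) returns (5, 1500).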

In a prior TOE device, the socket descriptor and hash were prepended onto the front of the packet where the packet is stored in DRAM. To perform a TCB lookup, the front part of the packet where the socket descriptor and hash were stored had to be first transferred out of DRAM to get the hash and socket descriptor necessary to perform the TCB lookup. In contrast to such a prior TOE device, the present TOE device passes the hash straight to the socket detector via the PrsEvtQ. Each PrsEvtQ entry contains PkEnd, SkHsh and SktDsc (see line 708). The packet is left in DRAM. Passing the hash and socket descriptor straight to the socket detector avoids having to transfer the front of the packet out of DRAM before the socket detector can perform the TCB lookup operation. Reducing the amount of DRAM accessing in this way allows the spared available DRAM access bandwidth to be used for other purposes.

Lines 710-712 in FIG. 7 indicate the operation of the socket detector block SktDet 22. SktDet block 22 pops the PrsEvtQ as indicated by line 710 and uses the retrieved hash SkHsh to see if the retrieved socket descriptor SktDsc is one of up to 4096 socket descriptors stored in a socket descriptor buffer SktDscBuf. SktDscBuf (see FIG. 2) is a dual-port memory that has 1024 buckets, where each bucket contains four socket descriptor buffers, and where each buffer can store one socket descriptor. If a TCP/IP connection is being handled in fast-path, then a socket descriptor for the connection will be stored in the SktDscBuf.

The hash SkHsh retrieved from the entry popped off the PrsEvtQ identifies one of the 1024 hash buckets. The socket detector block SktDet uses this SkHsh to identify the proper hash bucket and then checks each of the possible four socket descriptors stored in the hash bucket to see if one of those stored socket descriptors matches the socket descriptor retrieved from the PrsEvtQ.
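The bucket lookup just described can be modeled in Python as shown below. The class and method names are hypothetical, and returning the value bucket * 4 + slot as a connection index is an assumption made for illustration; the hardware compares all four stored descriptors of a bucket, which this sketch does with a short loop.

```python
from typing import NamedTuple, Optional

NUM_BUCKETS = 1024      # selected by the hash SkHsh
SLOTS_PER_BUCKET = 4    # socket descriptor buffers per bucket


class SktDsc(NamedTuple):
    hdr_cd: int   # header length code
    src_ad: int   # IP source address
    dst_ad: int   # IP destination address
    src_pt: int   # TCP source port
    dst_pt: int   # TCP destination port


class SktDscBuf:
    def __init__(self) -> None:
        self.buckets = [[None] * SLOTS_PER_BUCKET for _ in range(NUM_BUCKETS)]

    def store(self, sk_hsh: int, slot: int, dsc: SktDsc) -> None:
        self.buckets[sk_hsh][slot] = dsc

    def lookup(self, sk_hsh: int, dsc: SktDsc) -> Optional[int]:
        """Return a connection index on a match, or None (slow-path)."""
        for slot, stored in enumerate(self.buckets[sk_hsh]):
            if stored == dsc:
                return sk_hsh * SLOTS_PER_BUCKET + slot  # hypothetical index
        return None
```

A lookup that returns None corresponds to the no-match outcome, in which the packet is treated as a slow-path packet.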

In the presently described example, the packet is a slow-path packet. Accordingly, the “test” of line 711 results in no match. Socket detector (SktDet) then writes into DetEvtQ a slow code SlwCd, the PkEnd value, and a header length code HdrCd. The SlwCd is a two-bit value that indicates that the packet is a slow-path packet (i.e., there was no socket descriptor match). The header code HdrCd is a 2-bit code generated by the RcvPrs that indicates how deep the headers go in the particular packet. The headers extend to different depths into a packet depending on the type of packet received. The length of the header varies because the packet may or may not have certain headers such as 802.3 headers, a SNAP header, a VLAN header, an RDMA header, or an iSCSI header. HdrCd indicates how much of the packet needs to be DMA'd into the header buffer HdrBuf from DRAM 11 to make sure all the appropriate headers of the packet have been transferred. To conserve bandwidth on DRAM 11, either 64 bytes, or 128 bytes, or 192 bytes, or 256 bytes are transferred from DRAM 11 to the header buffer HdrBuf. The amount transferred is the smallest amount that still results in the headers being transferred. The socket engine SktEng block 24 only needs to look at the headers and does not have to look at the packet payload.
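The choice of transfer size can be illustrated with the short Python sketch below. The text states only that HdrCd is a 2-bit code and that 64, 128, 192, or 256 bytes are transferred; the specific encoding used here (code 0 for 64 bytes through code 3 for 256 bytes) and the function names are assumptions.

```python
HDR_UNIT = 64        # transfer granularity in bytes
MAX_HDR_BYTES = 256  # largest header transfer


def header_code_for(hdr_len: int) -> int:
    """Smallest 2-bit code whose transfer size still covers hdr_len bytes."""
    if not 0 < hdr_len <= MAX_HDR_BYTES:
        raise ValueError("header length must be between 1 and 256 bytes")
    return (hdr_len - 1) // HDR_UNIT


def transfer_size(hdr_cd: int) -> int:
    """Bytes moved from DRAM to HdrBuf for a given (assumed) code value."""
    if not 0 <= hdr_cd <= 3:
        raise ValueError("HdrCd is a 2-bit code")
    return (hdr_cd + 1) * HDR_UNIT
```

Under this assumed encoding, 54 bytes of headers yield code 0 and a 64-byte transfer, while 65 bytes of headers yield code 1 and a 128-byte transfer.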

The operations performed by receive manager RcvMgr block 23 are set forth in lines 714 and 715 of FIG. 7. RcvMgr pulls the entry off the DetEvtQ. As indicated by FIG. 4, an entry on the DetEvtQ can have one of four formats. The entry can, for example, start with a slow code SlwCd to indicate that the associated packet is to be handled in slow-path. Alternatively, the entry can start with a fast code FstCd to indicate that the associated packet is to be handled in fast-path. A DetEvtQ entry can also start with an enable code EnbCd or a disable code DblCd.

There is a slow path event queue SlwEvtQ through which the receive manager RcvMgr communicates slow-path events to the socket engine SktEng for further processing. There is also a fast path event queue FstEvtQ through which the receive manager RcvMgr communicates fast-path events to the socket engine SktEng for further processing. Each of these queues has a ready bit. This bit indicates to the socket engine SktEng that the queue has an entry to be popped.

In the present example, the RcvMgr block 23 detects the presence of a slow code SlwCd in the DetEvtQ entry. The RcvMgr block 23 therefore merely writes the entry back out on the slow event queue SlwEvtQ. The various variables of an entry that is written onto the SlwEvtQ are set forth in FIG. 4.

The socket engine SktEng is in its idle state and detects the ready bit of the SlwEvtQ. It pops the entry off the SlwEvtQ as indicated in line 717. (Also see line 1001 of FIG. 10 that will be explained later on in the description of the socket engine states). From the slow code SlwCd in the entry popped, the socket engine SktEng knows it has to just pass the associated packet off to host 3 because in slow-path processing the host stack performs TCP and IP protocol processing.

NID 4 passes information to host 3 by a message buffer mechanism. By this mechanism, host 3 maintains a queue of message buffer identifiers in a message buffer queue MsgBufQ (see FIG. 2) on NID 4. Each of these message buffer identifiers indicates a starting address of a free message buffer in host storage 6. NID 4 can use these message buffers to pass information to the host stack.

The socket engine SktEng therefore pops MsgBufQ and obtains a message buffer address MsgAd of a free message buffer (line 718). (Also see line 1079 of FIG. 10 that will be explained later on in the description of the socket engine states). The socket engine SktEng then copies (line 719) the parse header PrsHd and the packet RcvPk from the receive buffer identified by BufId to host memory HstMem at the host memory location dictated by the message address MsgAd plus an offset MsgHdrLen. The value BufId here is extracted from the PkEnd (see line 717) because PkEnd is a concatenation of BufId and PkLen. The slow path packet is then in host memory 6 in the message buffer.

A message header is then to be sent to the host to inform the host that the message buffer contains a slow path packet. This message header is to be placed at the front of the message buffer. To do this (see line 720), the message header MsgHd is written into host memory starting at MsgAd so that the message header is prepended to the RcvPk and PrsHd in the message buffer starting at address MsgAd. (Also see line 1087 of FIG. 10 that will be explained later on in the description of the socket engine states). With the packet out of DRAM 11, the DRAM buffer is freed up for storing other packets coming in off the network. This is done as indicated by line 721 by the socket engine SktEng pushing the BufId of the now free DRAM buffer onto the RcvBufQ of free buffers.
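The slow-path hand-off of lines 718 through 721 can be modeled in Python as shown below, with byte arrays standing in for DRAM 11 and host memory. The 16-byte value of MSG_HDR_LEN, the function name, and the treatment of the low eleven bits of PkEnd as the combined length of PrsHd and RcvPk are assumptions made for illustration.

```python
from collections import deque

BUF_SHIFT = 11                       # 2048-byte DRAM buffers
OFFSET_MASK = (1 << BUF_SHIFT) - 1
MSG_HDR_LEN = 16                     # assumed message header size in bytes


def slow_path_deliver(pk_end, dram, hst_mem, msg_buf_q, rcv_buf_q, msg_hd):
    """Copy {PrsHd, RcvPk} to a host message buffer and free the DRAM buffer."""
    if len(msg_hd) != MSG_HDR_LEN:
        raise ValueError("message header must be MSG_HDR_LEN bytes")
    buf_id = pk_end >> BUF_SHIFT       # BufId extracted from PkEnd
    length = pk_end & OFFSET_MASK      # bytes of PrsHd + RcvPk in the buffer
    start = buf_id << BUF_SHIFT
    msg_ad = msg_buf_q.popleft()       # line 718: obtain a free message buffer
    body = msg_ad + MSG_HDR_LEN
    hst_mem[body:body + length] = dram[start:start + length]  # line 719
    hst_mem[msg_ad:body] = msg_hd      # line 720: header goes in front
    rcv_buf_q.append(buf_id)           # line 721: return DRAM buffer to RcvBufQ
    return msg_ad
```

Writing the packet before the message header mirrors the order given in the text, so that a host that sees the header can rely on the packet already being in place.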

Once the packet and the message header are present in the host message buffer identified by MsgAd, the host stack examines the host message buffer (either due to an interrupt or due to host processor polling), retrieves the message header, determines that the message buffer contains a slow-path packet, and performs any necessary TCP and IP protocol processing on the packet.

Connection Handout Sequence:

NID 4 performs all or substantially all TCP and IP protocol processing on certain types of network communications, thereby offloading the host stack of this task. In the present example, a TCP connection is set up by the protocol stack executing on host 3. Once the TCP connection is set up, control of the TCP connection can be passed from host 3 to NID 4. After control of the TCP connection has passed to NID 4, NID 4 performs all TCP and IP protocol processing on subsequent packets communicated across the TCP connection provided that certain error conditions do not occur. The packet payloads of subsequent packets received over the connection are written by NID 4 directly into a final destination in application memory on host 3. The packet payloads are placed in the final destination free of any TCP or IP network headers. For control of the TCP connection to be passed from host 3 to NID 4, control of a TCB (TCP control block, sometimes called transaction control block or transmission control block) of information for that TCP connection is passed from host 3 to NID 4. The TCB includes information on the state of the TCP connection. Passing control of the TCB for the connection in this embodiment involves actually transferring a part of the TCB information from host 3 to NID 4. Control of a TCP connection is passed from host 3 to NID 4 in a two-phase “connection handout” process set forth below.

First Phase (slow-path purge): In a first phase of the connection handout process, host 3 tentatively passes control of the TCP connection to NID 4. NID 4 places a purge marker into the flow of slow-path packets received onto NID 4 for that connection, and holds subsequently received incoming packets for that connection without passing them on to host 3. Host 3 continues to process the flow of slow-path packets for that connection. When host 3 receives the purge marker, host 3 knows that the slow-path flow of packets has been purged and that it will receive no more packets for that connection. The first phase of the connection handout process is complete, and the host is then free to pass the TCP state to NID 4.

FIG. 8 sets forth a pseudocode hardware description of what the various blocks of FIG. 2 do to perform the connection handout. As indicated in line 801, host 3 builds in host memory, at location CmdAd, a receive command (i.e., a purge receive command). The purge receive command includes a command header CmdHd that indicates a purge receive command and that includes the address TcbAd on the host where the TCB of the connection is stored. As indicated by line 802, host 3 then notifies NID 4 of the purge receive command by writing a pointer (CmdAd) and an indication of the connection (TcbId) into the HstEvtQ on NID 4. The pointer CmdAd points to the receive command in host memory. The TcbId indicates which connection NID 4 is to start holding packets for.

The socket engine SktEng block 24 is in its idle state and detects the HstEvtQ being ready. The socket engine therefore pops (line 804) the host event queue HstEvtQ and retrieves the command address CmdAd and TcbId. (Also see lines 1049 thru 1052 of FIG. 10 that will be explained later on in the description of the socket engine states).

From the command address CmdAd, the socket engine SktEng causes DMA controller block 25 to retrieve the receive command (command header CmdHd and TcbAd). The socket engine SktEng looks at the command code CmdCd in the command header CmdHd and sees that this is a “Socket Descriptor Import” command (also called a “Tcb Import” command even though it results in the socket descriptor part of the TCB being imported but not the TCP state part of the TCB). See lines 1144 and 1145 of FIG. 10 that will be explained later on in the description of the socket engine states.

The socket engine SktEng uses the TcbAd to copy (line 806) the socket descriptor SktDsc from the TCB in host memory HstMem and puts the socket descriptor SktDsc into the particular socket descriptor buffer SktDscBuf on NID 4 identified by TcbId. As set forth above, there is one socket descriptor buffer on NID 4 for each Tcb that is being fast-path processed by NID 4. As indicated by FIG. 6, the socket descriptor SktDsc includes a header code HdrCd, the TCP source port, the TCP destination port, the IP source address, the IP destination address, and a detect enable bit DetEn. (Also see lines 1153-1159 of FIG. 10 that will be explained later on in the description of the socket engine states).

Once the socket descriptor for the connection is loaded into the socket descriptor buffer SktDscBuf on NID 4, the socket engine SktEng sends (line 807) a detect enable command EnbCd to the socket detector SktDet block 22. The detect enable command EnbCd instructs the socket detector block SktDet 22 to start detecting packets for this connection. The detect enable command is sent as indicated in line 807 by the socket engine SktEng writing a socket enable code EnbCd and the TcbId into the detect command queue DetCmdQ. (Also see lines 1161-1162 of FIG. 10 that will be explained later on in the description of the socket engine states).

The socket detector SktDet block 22 pops the DetCmdQ (line 809) and retrieves the socket enable code EnbCd and TcbId. The socket detector SktDet then “sets” (line 810) a detect enable bit DetEn in the particular socket descriptor buffer identified by SktDscBuf[TcbId]. This detect enable bit DetEn is a bit in the socket descriptor SktDsc that tells the socket detector that the next time it receives a packet and compares the Tcb with the headers in the packet, it can indicate the match in the parse header PrsHd. If the detect enable bit is not set, then the socket detector will not indicate a match, even if the Tcb and the packet headers do compare. (The detect enable bit prevents the socket detector SktDet from erroneously determining that an incoming packet should be handled in fast-path because a match to an invalid socket descriptor entry was used.)
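The gating effect of the detect enable bit can be sketched as follows. The `SktDsc` fields mirror the descriptor contents listed for FIG. 6 (the header code is omitted), while the Python types, the function names, and the representation of the packet headers as a 4-tuple are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class SktDsc:
    """Software stand-in for one socket descriptor buffer entry."""
    src_port: int
    dst_port: int
    src_ip: str
    dst_ip: str
    det_en: bool = False  # detect enable bit DetEn, cleared until EnbCd arrives


def enable_detect(skt_dsc_buf, tcb_id):
    """Model of line 810: set DetEn in SktDscBuf[TcbId]."""
    skt_dsc_buf[tcb_id].det_en = True


def indicates_match(skt_dsc_buf, tcb_id, pkt_tuple):
    """A match is indicated only if the headers compare AND DetEn is set."""
    dsc = skt_dsc_buf[tcb_id]
    compares = (dsc.src_port, dsc.dst_port, dsc.src_ip, dsc.dst_ip) == pkt_tuple
    return compares and dsc.det_en
```

A descriptor that compares equal but has DetEn clear therefore never steers a packet to fast-path handling.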

The socket detector SktDet block 22 writes an enable code EnbCd purge marker along with the TcbId into the detect event queue DetEvtQ (line 811).

The receive manager RcvMgr reads (line 813) the DetEvtQ and gets the EnbCd purge marker and TcbId. The RcvMgr inserts the EnbCd purge marker into the flow of packets for the connection identified by the TcbId by writing the EnbCd and TcbId into the slow event queue SlwEvtQ (line 814). Accordingly, in addition to entries that contain PkEnd values that point to buffers in DRAM where slow-path packets are stored, entries having the format of purge markers can also go onto this SlwEvtQ. For the format of such an SlwEvtQ entry containing an EnbCd purge marker, see FIG. 4.

Receive descriptors for any packets corresponding to this Tcb that were received before the Tcb handout was initiated will therefore have been in the SlwEvtQ before the EnbCd purge marker, and they will be handled in normal slow-path fashion by being sent to the host stack before the purge marker is popped off the SlwEvtQ.

Receive descriptors for packets received after the purge marker, on the other hand, will be held on NID 4 by putting them in a socket receive queue SktRcvQ (see FIG. 2). The receive descriptors for such packets stack up in this SktRcvQ. The actual packets are stored as usual by the receive parser RcvPrs in DRAM receive buffers, but these DRAM receive buffers are pointed to by the PkEnd variables in the receive descriptors that are held up on the SktRcvQ.
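The ordering guarantee provided by the purge marker can be modeled with two queues. This is a simplified sketch: the class name `RcvMgrModel` and the `holding` set (which stands in for the effect of the detect enable bit having been set for the connection) are assumptions; the event code values 0 for SlwCd and 1 for EnbCd follow the SlwEvtQ event codes given in the description.

```python
from collections import defaultdict, deque

SLW_CD = 0  # slow code
ENB_CD = 1  # enable code (purge marker)


class RcvMgrModel:
    """Simplified model of descriptor routing during the first handout phase."""

    def __init__(self):
        self.slw_evt_q = deque()               # flows to the host stack
        self.skt_rcv_q = defaultdict(deque)    # one held queue per TcbId
        self.holding = set()                   # hypothetical bookkeeping

    def on_purge_marker(self, tcb_id):
        # The marker enters the slow event queue behind earlier descriptors.
        self.slw_evt_q.append((ENB_CD, tcb_id))
        self.holding.add(tcb_id)

    def on_descriptor(self, tcb_id, pk_end):
        if tcb_id in self.holding:
            self.skt_rcv_q[tcb_id].append(pk_end)   # held on NID 4
        else:
            self.slw_evt_q.append((SLW_CD, pk_end))  # normal slow path
```

Because the slow event queue is first-in-first-out, every descriptor queued ahead of the marker reaches the host before the marker does.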

The socket engine SktEng is in its idle state and detects an entry on the SlwEvtQ. The socket engine SktEng therefore reads (line 816) the SlwEvtQ and gets the EnbCd purge marker and TcbId. The EnbCd purge marker is a type of event code EvtCd (EvtCd==1). When the socket engine SktEng sees this particular EnbCd purge marker, the SktEng obtains a message buffer address MsgAd (that points to a message buffer MsgBuf that NID 4 can use to send a message to host 3 indicating the completion of phase one of the handout). SktEng therefore pops the MsgBufQ to obtain a MsgAd as indicated in line 817. The SktEng block 24 causes the DMA controller 25 to write a message header MsgHd into a message buffer on the host identified by the MsgAd. The message header MsgHd here indicates an enable mark message EnbMrkMsg. (Also see lines 1094-1100 of the enable mark event state of FIG. 10 that will be explained later on in the description of the socket engine states).

All messages from NID 4 to host 3 go through a virtual queue of message buffers in the host memory. As set forth above, host 3 hands out the starting addresses of free message buffers to NID 4 via the MsgBufQ. Host 3 also keeps track of the order in which the message buffer addresses were handed out to NID 4. The host stack examines the message buffers in exactly the order that the message buffers were handed out. The purge marker therefore flows through to the host stack in the form of a message header MsgHd in this virtual queue. This message header MsgHd purge marker tells the host stack that the host receive command (read by the socket engine in line 805) has been completed by NID 4.

Second Phase (load socket state): Host 3 reads (line 820) the MsgHd purge marker from NID 4 by reading HstMem at location MsgAd where NID 4 placed the message header. Upon detecting that the first phase of the connection handout has been completed, host 3 writes (line 821) the socket state in the form of the socket tuple SkTpl into a portion of host memory identified by TcbAd. As indicated by FIG. 6, the socket tuple SkTpl contains the state information of the TCB for the connection. Host 3 also writes (line 822) a socket receive command SktRcvCmd in the form of a command header CmdHd and TcbAd into the command buffer identified by CmdAd. Host 3 writes (line 823) an entry containing the address CmdAd of the command buffer and the TcbId into the host event queue HstEvtQ on NID 4.

The socket engine SktEng 24 is in the idle state and detects that the HstEvtQ is ready. The socket engine therefore pops the HstEvtQ (line 825) and uses the CmdAd pointer to read (line 826) the host command (command header CmdHd and TcbAd) from the buffer in host memory. (Also see lines 1137-1141 of FIG. 10 that will be explained later in the description of the socket engine states).

The TcbAd identifies the location of the socket tuple in host memory. The command is a socket enable command. The socket engine SktEng therefore uses the TcbAd to copy (line 827) the state information SkTpl from host memory HstMem into the Tcb buffer on NID 4 identified by TcbId. (Also see lines 1177-1183 of FIG. 10 that will be explained later on in the description of the socket engine states).

Once the state has been loaded, the SktEng sends an “arm” code ArmCd (line 828) to the receive manager via the MgrCmdQ. (Also see line 187 of FIG. 10 that will be explained later in the description of the socket engine states).

The receive manager RcvMgr maintains an event enable EvtEn bit in a one-bit wide MgrBuf (see FIG. 2). There is one such event enable bit for each Tcb controlled by NID 4. If the event enable bit EvtEn for a Tcb is set, then the RcvMgr is “armed” to send one fast-path event to the socket engine SktEng for that Tcb. If there is no event for that Tcb queued on the SktRcvQ, then the receive manager RcvMgr waits until the SktRcvQ has an entry for that TcbId. When a receive descriptor is in the SktRcvQ for that TcbId as indicated by SktRcvQRdy[TcbId], then the receive manager RcvMgr pops the SktRcvQ for that TcbId and uses that entry to generate one fast-path event for the socket engine SktEng to process for that TcbId.

Accordingly, the receive manager RcvMgr pops the MgrCmdQ (line 830) and retrieves the ArmCd and TcbId. If there is a receive descriptor for this TcbId on SktRcvQ, then SktRcvQRdy[TcbId] will be true. The receive manager RcvMgr pops the SktRcvQ for that Tcb and moves the event to the fast-path event queue (line 832) by writing a FstCd and the TcbId onto the FstEvtQ.

If there is not presently a receive descriptor for this TcbId on the SktRcvQ, then SktRcvQRdy[TcbId] is false. The receive manager RcvMgr therefore sets the EvtEn bit in MgrBuf for the TcbId as indicated by line 834. The EvtEn bit is set so that the next time a receive descriptor is found on the SktRcvQ for the TcbId, the receive manager RcvMgr will pop it off the SktRcvQ for that TcbId and generate (see line 919 in FIG. 9) a fast-path event on the FstEvtQ, and then clear the EvtEn bit in MgrBuf for the TcbId.

It is therefore recognized that the EvtEn bit is used to implement a one-time arm command mechanism because when a fast-path event is returned by the receive manager RcvMgr to the socket engine SktEng, the EvtEn bit is cleared. Setting the EvtEn bit arms the receive manager RcvMgr to send one and only one fast-path event to the socket engine SktEng. This one-time arm command mechanism is provided so that the socket engine SktEng will process a packet on a fast-path connection (TcbId) only after it has completed processing the prior packet on that fast-path connection.
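The one-time arm mechanism can be sketched as a small model. The class and method names are hypothetical, the event code is represented by the string `"FstCd"` rather than an actual encoding, and the model follows the passage above in popping the descriptor at the time the fast-path event is generated.

```python
from collections import defaultdict, deque


class ArmModel:
    """Model of the one-shot EvtEn arming behavior of the receive manager."""

    def __init__(self):
        self.evt_en = defaultdict(bool)       # MgrBuf: one EvtEn bit per TcbId
        self.skt_rcv_q = defaultdict(deque)   # held receive descriptors
        self.fst_evt_q = deque()              # events for the socket engine

    def _fire(self, tcb_id):
        desc = self.skt_rcv_q[tcb_id].popleft()
        self.fst_evt_q.append(("FstCd", tcb_id, desc))
        self.evt_en[tcb_id] = False           # one-shot: disarm after one event

    def arm(self, tcb_id):
        """ArmCd received from the socket engine via the MgrCmdQ."""
        if self.skt_rcv_q[tcb_id]:            # SktRcvQRdy[TcbId] is true
            self._fire(tcb_id)
        else:
            self.evt_en[tcb_id] = True        # wait for the next descriptor

    def on_descriptor(self, tcb_id, desc):
        """A new receive descriptor arrives for a fast-path connection."""
        self.skt_rcv_q[tcb_id].append(desc)
        if self.evt_en[tcb_id]:
            self._fire(tcb_id)
```

However many descriptors arrive, the socket engine sees at most one fast-path event per arm command, which is what serializes packet processing per connection.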

Upon conclusion of the actions set forth in FIG. 8, communications across the TCP connection identified by TcbId are being handled by NID 4 in fast-path. Receive descriptors for packets on the fast-path TCP connection are passed from receive manager RcvMgr to the socket engine SktEng via the fast event queue FstEvtQ for fast-path processing.

Fast-Path Packet Receive Sequence:

First Fast Path Receive Packet (the first one that sets up the final destination):

A packet of network information enters ASIC 10 from PHY 12 as indicated by arrow 30 in FIG. 2. Processing through the receive Mac block RcvMac 20 and the receive parser block RcvPrs 21 (see lines 900-908 of FIG. 9) occurs in much the same way as in the slow-path receive sequence, except that in the case of a fast-path receive sequence, the connection handout sequence described above in connection with FIG. 8 has already occurred. A socket descriptor for the TCP connection is already in the socket descriptor buffer SktDscBuf, and the socket tuple SkTpl that holds the state of the TCP connection is already in the Tcb buffer.

When the socket detector SktDet block 22 retrieves the socket hash SkHsh (line 910) and does the test SktDscBuf (line 911), a match is detected. The socket detector SktDet therefore writes a fast code FstCd into the detect event queue DetEvtQ (line 912) rather than a slow code SlwCd. As indicated by lines 912 and 913, two entries are pushed onto the DetEvtQ so that the fast code FstCd, a header code HdrCd, the packet end pointer PkEnd, and the TcbId are passed to the receive manager RcvMgr. Two entries are required because DetEvtQ is only 32 bits wide. When the receive manager RcvMgr receives a FstCd, it knows that the next entry is the TcbId so it automatically pops that second entry off the DetEvtQ as well as the first entry.
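The two-entry convention on the 32-bit DetEvtQ can be sketched as follows. The text does not give the bit layout of an entry, so the field widths (2-bit code, 6-bit HdrCd, 24-bit PkEnd), the numeric code values, and the function names below are all assumptions chosen only so that the first entry fits in 32 bits.

```python
from collections import deque

# Hypothetical encodings and field positions within one 32-bit entry.
SLW_CD, FST_CD = 0, 1
CD_SHIFT, HDR_SHIFT = 30, 24
PKEND_MASK = (1 << HDR_SHIFT) - 1


def push_fast(det_evt_q, hdr_cd, pk_end, tcb_id):
    """Push a fast event as two 32-bit entries: {FstCd, HdrCd, PkEnd}, then TcbId."""
    assert 0 <= hdr_cd < 64 and 0 <= pk_end <= PKEND_MASK
    det_evt_q.append((FST_CD << CD_SHIFT) | (hdr_cd << HDR_SHIFT) | pk_end)
    det_evt_q.append(tcb_id & 0xFFFFFFFF)


def pop_event(det_evt_q):
    """Pop one event; a fast code implies a second entry holding the TcbId."""
    word0 = det_evt_q.popleft()
    cd = word0 >> CD_SHIFT
    hdr_cd = (word0 >> HDR_SHIFT) & 0x3F
    pk_end = word0 & PKEND_MASK
    tcb_id = det_evt_q.popleft() if cd == FST_CD else None
    return cd, hdr_cd, pk_end, tcb_id
```

The consumer decides whether to pop a second entry purely from the code in the first entry, so no length field is needed in the queue itself.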

As indicated by lines 915 and 916, the receive manager RcvMgr block 23 pops the DetEvtQ twice and retrieves the entries placed there by the socket detector SktDet.

The receive manager RcvMgr maintains a first-in-first-out socket receive queue SktRcvQ. SktRcvQ is actually a plurality of queues, one for each TcbId. The SktRcvQ for a TcbId holds detect event descriptors for the TcbId. In the case where the receive manager RcvMgr block 23 is not armed to send a fast-path event to the socket engine SktEng and the RcvMgr receives an event from the DetEvtQ, the RcvMgr block 23 pushes the fast-path event onto the SktRcvQ for that TcbId. Accordingly, as indicated by line 917, the receive manager RcvMgr in this example writes a SlwCd to the SktRcvQ for the TcbId. The RcvMgr checks the MgrBuf for that TcbId to see if the EvtEn bit for the TcbId is set. If the EvtEn bit is set, then the RcvMgr puts a FstCd and TcbId in the FstEvtQ and clears the EvtEn bit (line 920). The EvtEn bit is cleared because it is a one-shot and one fast-path event has just been forwarded to the socket engine SktEng.

The socket engine is in its idle state when it detects the fast event queue being ready. The socket engine SktEng therefore pops (line 923) the FstEvtQ and retrieves the FstCd and TcbId. It then requests the headers for the fast-path packet by writing a ReqCd, HdrId, and TcbId onto the MgrCmdQ for the receive manager RcvMgr. (The Tcb buffer for the TcbId contains state, an MDL entry, some work area, and then areas for storing headers.) (Also see line 1117 of FIG. 10 that will be explained later in the description of the socket engine states).

The receive manager RcvMgr pops (see line 926) the MgrCmdQ and retrieves the ReqCd, HdrId and TcbId. Using the TcbId, it pops (see line 927) the SktRcvQ and gets a receive event descriptor for the current TcbId. It then uses the HdrCd and PkEnd to instruct the DRAM controller to transfer (see line 928) the parse header PrsHd and packet header PkHd from the DRAM receive buffer identified by BufId into the header buffer HdrBuf identified by the HdrId. HdrBuf is not a queue. Rather, there is one HdrBuf for each TcbId. The amount transferred out of the DRAM receive buffer by the DRAM controller is based on the HdrCd. The receive manager RcvMgr then writes (see line 929) the TcbId into the HdrEvtQ to tell the socket engine SktEng that the headers have been put in HdrBuf[TcbId].

The socket engine SktEng is again in its idle state and detects the header event queue being ready. The socket engine therefore pops the HdrEvtQ (see line 931) and retrieves the TcbId. The socket engine SktEng is thereby informed that the headers are present in HdrBuf[TcbId] and that it has a packet for the socket identified by TcbId. The socket engine SktEng now processes the packet. It does parallel processing in checking acknowledge, window and sequence numbers as indicated in line 933. If the packet passes these tests, then the packet will be handled in fast path and the socket engine SktEng will have to know where to transfer the data in this packet.

SktEng determines where to transfer the data as follows. The first TCP payload data for a multi-packet session layer message includes the session layer header; subsequent packets do not. NID 4 passes this first TCP packet to the host stack into a virtual queue in host memory along with a message header that identifies the TcbId. The message header tells the stack that the packet is an initial receive packet for this TcbId. The host stack translates the TcbId used by NID 4 into the Tcb number used on the host. The host stack gives the session layer header of the packet to the application program operating on the host. The application program returns a memory descriptor list (MDL) indicating where to put the data payload of the session layer message. The MDL includes a list of entries, where each entry includes an address and a size. The MDL also includes a total byte count. The host passes an MDL to NID 4 by pushing an appropriate host command pointer onto the HstEvtQ. The host passes the command to NID 4 by building the command in one of its command buffers, and then putting a pointer to the command buffer on the host into the HstEvtQ of NID 4. NID 4 pops the HstEvtQ, retrieves the pointer, and DMA transfers in the command from the command buffer on the host. NID 4 now has the address and size for the first MDL entry, as well as the total byte count. When NID 4 receives payload data for a packet on the identified TcbId, NID 4 DMA transfers the data payload into host storage 6 at the address identified by the MDL entry, and decrements the size value of the MDL entry for the amount of payload data moved. As more payload data is received, the MDL entry is decremented until the size value expires. A second MDL entry is then retrieved from host memory, and the process proceeds until the second MDL entry is filled. This process continues until the total byte count for the MDL is exhausted. When the entire MDL list of entries is exhausted, the NID 4 card informs the host of this fact via the master event queue MstEvtQ and by sending an interrupt to the host. NID 4 and the host then go back through the initial phase of passing an initial packet up through the host stack to the application program to get a second MDL list back.
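The consumption of MDL entries by incoming payload can be sketched as follows. The dictionary representation of the MDL and the function name `place_payload` are hypothetical. The text describes decrementing the size value of an entry; advancing the entry address by the amount moved is an added assumption so that later payload lands after earlier payload. In the described device, only one MDL entry is held on NID 4 at a time and the next is fetched from host memory, which this sketch simplifies to an in-memory list.

```python
def place_payload(mdl, payload_len):
    """Split one packet payload across MDL entries.

    mdl is {"entries": [[addr, size], ...], "total": byte_count}.
    Returns the list of (host_address, byte_count) DMA moves to perform.
    """
    moves = []
    while payload_len > 0 and mdl["entries"] and mdl["total"] > 0:
        entry = mdl["entries"][0]
        n = min(payload_len, entry[1], mdl["total"])
        if n > 0:
            moves.append((entry[0], n))
        entry[0] += n          # assumption: next write continues after this one
        entry[1] -= n          # decrement the size value of the MDL entry
        mdl["total"] -= n      # decrement the total byte count
        payload_len -= n
        if entry[1] == 0:
            mdl["entries"].pop(0)   # entry filled; move to the next entry
    return moves


def mdl_exhausted(mdl):
    """True when the host must be notified and a new MDL obtained."""
    return mdl["total"] == 0 or not mdl["entries"]
```

A payload that straddles an entry boundary simply produces two moves, one finishing the current entry and one starting the next.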

Accordingly, in line 934, the socket engine SktEng checks to see if there is a valid descriptor entry for the TcbId. There is an MDL valid bit MdlVd in the Tcb buffer of each TcbId. The socket engine SktEng examines this MdlVd bit. If it is set, then the actions set forth in lines 935-942 are carried out; otherwise the actions set forth in lines 943-961 are carried out.

If there is a valid MDL entry, the socket engine SktEng reads (line 935) the header buffer Id BufId from the HdrBuf identified by HdrId. (The BufId is part of the parse header PrsHd.) The socket engine SktEng reads the MDL entry (called ApDsc here) that was stored in the Tcb buffer TcbBuf for this TcbId. The socket engine SktEng copies (line 937) the packet payload PayLd from the DRAM receive buffer RcvBuf[BufId] into host application memory HstMem[ApDsc]. The socket engine SktEng performs the data move by putting a request onto the DRAM-to-host request queue D2hReqQ. This request causes the DRAM controller 26 to move the payload from the DRAM receive buffer into the DRAM-to-host data queue D2hDatQ. The socket engine SktEng then gives the DMA controller/PCI bus interface unit block 25 through the MstWrReqQ a request to move the packet payload from the D2hDatQ into host memory to the location identified by the ApDsc (i.e. MDL entry).

Once the data transfer has been completed, the DMA controller/PCI bus interface unit block 25 sends a master event back to the socket sequencer on the MstEvtQ. This master event informs the socket engine SktEng that the transfer of the packet payload into host memory has been completed.

Here, for simplicity purposes, it is assumed that the transfer of data exhausted the last of the MDL entries. NID 4 therefore must notify the host stack that the MDL entries have been filled with data. (This notification involves the second host interrupt for the MDL. The first interrupt occurred on the first packet, which resulted in the MDL being returned to NID 4. This is the second interrupt, after the MDL has been filled with data.)

To notify the host, NID 4 reads (line 938) the address MsgAd of a free message buffer off the message buffer queue MsgBufQ. Socket engine SktEng places an appropriate message header MsgHd into the message buffer. This message header MsgHd indicates to the host stack that data has been put in the application memory space for the TcbId. Now that the data payload has been moved over to the host memory, the DRAM receive buffer on NID 4 is no longer needed and is made available for receiving another packet. The BufId of the DRAM receive buffer is therefore pushed (line 940) back onto the free buffer queue RcvBufQ.

Here in the present example, for simplicity purposes, it is assumed that the transfer of the data payload resulted in the completion of the MDL receive command and the filling of the entire MDL list made available by the host. Accordingly, there is no longer an MDL entry available to be filled for the TcbId. The MDL is therefore “invalid” and the socket engine SktEng clears the MdlVd bit in the TcbBuf[TcbId] as indicated in line 942.

If a fast-path packet is received and the MDL valid bit MdlVd is false, then the packet might be the first packet of a fast-path multi-packet session layer message. In such a situation, because MdlVd is false, processing proceeds to lines 943-961. The entire packet (header and data) will be passed to the host stack along with a fast code FstCd so that the host stack can return an MDL entry for the next fast-path packet received on the connection. Accordingly, the socket engine SktEng retrieves an address MsgAd of a free message buffer (line 944) off the message buffer queue MsgBufQ. The entire packet PayLd is then copied (line 945) from the DRAM receive buffer RcvBuf identified by BufId into the general purpose message buffer on the host identified by MsgAd. The message header MsgHd indicates that the packet payload is of a fast-path receive packet, indicates the TcbId, and indicates how much information is being transferred to the host. Once the payload PayLd has been copied from the DRAM receive buffer to the host memory, the DRAM receive buffer is recycled (see line 947) by writing the DRAM receive buffer identifier BufId back onto the free receive buffer queue RcvBufQ.

The host stack retrieves the message (line 949) from the general purpose host message buffer HstMem[MsgAd] and processes the payload of the message MsgDt (the entire packet) up through the IP protocol processing layer and the TCP protocol processing layer and delivers the session layer header to the application layer program. The application layer program returns an MDL list. The host stack moves the data portion of the session layer message of the first packet into the area of host memory identified by the first MDL entry, and decrements the first MDL entry to reflect the amount of data now in the host memory MDL entry area. The host stack then supplies the first MDL entry (so decremented) to NID 4 by writing a command header CmdHd and the MDL entry (called an ApDsc) into a location in host memory (line 950). The host stack then gives NID 4 notice of the command by pushing (line 951) the starting address of the command CmdAd and the TcbId onto the host event queue HstEvtQ on NID 4.

The socket engine SktEng pops (line 953) the HstEvtQ and retrieves the address of the command in host memory CmdAd and the TcbId. The command header CmdHd and command data (in this case ApDsc) are then transferred (line 954) into a Tcb buffer TcbBuf[TcbId]. Because an MDL entry is now present on NID 4 to be filled, the socket engine SktEng sets (line 955) the MDL valid MdlVd bit in the TcbBuf identified for this connection (TcbId). The socket engine SktEng then sends an arm command to the receive manager by writing an arm code ArmCd along with the TcbId onto the manager command queue MgrCmdQ. This will arm the receive manager to send one fast-path event for this TcbId to the socket engine SktEng.

The receive manager RcvMgr pops the MgrCmdQ and retrieves the arm command ArmCd and the TcbId (line 958). If the socket receive queue has an entry to be popped for this TcbId (line 959), then the receive manager RcvMgr transfers (line 960) the event off the SktRcvQ for that Tcb onto the fast event queue FstEvtQ by writing a fast code FstCd and TcbId onto FstEvtQ. If, on the other hand, there is no event on the socket receive queue for this Tcb, then the receive manager RcvMgr remains armed to send a fast-path event to the socket engine SktEng when a receive event is received by the receive manager RcvMgr for this TcbId sometime in the future. The socket engine SktEng therefore returns to line 922 to monitor the fast event queue FstEvtQ to wait for a fast-path event to be passed to it from the receive manager RcvMgr.

Socket Engine States:

FIG. 10 sets forth a plurality of states through which the socket engine SktEng state machine transitions. These states are: Idle, SlwRcvEvt, SlwRcv0, EnbMrkEvt, DblMrkEvt, DblMrk0, FstRcvEvt, ClrMrkEvt, ClrMrk0, SktCmdEvt, SktCmd0, SktEnbCmd, SktEnb0, SktArmCmd, SktArm0, SktRcvCmd, SktRcv0, HdrDmaEvt, HdrEvt0, FastRcv, UpdMdlEntries and InitRcv. The socket engine SktEng proceeds from one state to another state based on current state variables and incoming information. State transitions are only made at the time of a rising edge of a clock period of the clock signal that clocks the state machine. All actions set forth within a definition of a state below occur simultaneously within the state.

In the Idle state, the state machine is looking for something to do. If the slow event queue has a slow event entry to be popped off (i.e., SlwEvtQRdy is true), then the actions in lines 1001-1018 are carried out. If the event code EvtCd is zero, then the EvtCd is the slow code SlwCd and the actions in lines 1002-1009 are carried out. If the event code EvtCd is one, then the EvtCd is an enable code EnbCd and the actions in lines 1011-1018 are carried out. If the event code EvtCd is neither zero nor one, then the EvtCd is the disable code DblCd and the actions in lines 1020-1028 are carried out. The event code EvtCd is set by the socket detector SktDet block 22 as described above. The testing of the SlwEvtQ{EvtCd} bits as well as the “if, then, else” logic set forth in the Idle state is performed by digital logic hardware gates.

If the event code EvtCd is a zero (see line 1001), then processing of the slow-path packet is to be handed over to the host. The “E” before the designators EState, ETcbId, ECmdAd, EHdrCd, EHdrAd, EBufId and EPkLen indicates local registers within the socket state machine. Engine state EState indicates the state to which the socket engine will transition next. Accordingly, EState is set to SlwRcvEvt which is the state for handling a slow-path receive event. The TcbId extracted from the entry popped off the SlwEvtQ is written into ETcbId. Similarly, the length of the header HdrCd, the receive buffer number BufId and the receive buffer length PkLen extracted from the entry popped off the SlwEvtQ are loaded into EHdrCd, EBufId, and EPkLen, respectively. HdrCd is a code generated by the receive parser RcvPrs block 21 that indicates how big the headers are so that the socket engine can be sure to read in the parse header, the MAC header, IP header, and TCP header. In FIG. 10, the values “x” indicate don't cares. The use of “x's” allows the hardware synthesizer that synthesizes the hardware description code into logic gates to simplify the resulting hardware. All assignments within the begin-end section for EvtCd being a zero occur on one clock transition.

If EvtCd is a one (i.e., is EnbCd), then the slow event is an enable mark event (see line 1010). In the enable mark event, a purge marker EnbCd being received by the socket engine SktEng tells the socket engine SktEng that it will not receive any more descriptors for the indicated connection because the descriptors are being held in the first phase of a connection handout (Dsc import). EState is therefore loaded with EnbMrkEvt which is the state for handling an enable mark event. The other “E” values are loaded in the same fashion as in the case described above where the event code was a zero. All assignments within the begin-end section for EvtCd being a one occur on one clock transition.

If EvtCd is neither a zero nor a one (i.e., is DblCd), then the slow event is a disable mark event (see line 1019). The engine state EState is loaded with DblMrkEvt which is the state for handling a disable mark event.

If SlwEvtQRdy is not true (there is no slow event to be popped off the slow event queue) but FstEvtQRdy is true (there is an entry to be popped off the fast event queue), then the socket engine SktEng pops the FstEvtQ and the event code EvtCd is extracted and checked (see line 1029).

If the extracted EvtCd is a zero (see line 1030), then the fast event is a fast receive event and the actions of lines 1030-1038 are carried out. EState is therefore loaded with FstRcvEvt which is the state for handling fast receive events. The “E” values are loaded as indicated.

If the extracted EvtCd is not a zero, then the fast event is a clear mark event and the actions in lines 1040-1047 are carried out. EState is therefore loaded with ClrMrkEvt which is the state for handling a clear mark event. The “E” values are loaded as indicated.

If SlwEvtQRdy and FstEvtQRdy are both false, then HstEvtQRdy is checked (see line 1049). If HstEvtQRdy is true, then an entry is popped off the HstEvtQ and the actions of lines 1050-1056 are carried out. EState is loaded with SktCmdEvt which is the state in which host commands are handled. In addition to saving the TcbId extracted from the popped entry in ETcbId, the address of the command in host memory is extracted from the popped entry and is stored in ECmdAd. This allows the DMA controller block 25 on NID 4 to go retrieve the command from host memory and load it into NID 4 for processing.

If SlwEvtQRdy, FstEvtQRdy, and HstEvtQRdy are all false, then HdrEvtQRdy is checked (see line 1058). If HdrEvtQRdy is true, then the HdrEvtQ is popped. EState is loaded with HdrDmaEvt, which is the state in which entries off the HdrEvtQ are handled. The “E” values are loaded as indicated.

If SlwEvtQRdy, FstEvtQRdy, HstEvtQRdy, and HdrEvtQRdy are all false, then there is no event to service and the actions on lines 1067-1075 are carried out. EState is loaded with Idle so that the next state remains the Idle state. The socket state machine therefore will continue to check for events to handle in the Idle state.
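The priority order that the Idle state applies to the four event queues can be summarized in a short Python sketch. It is illustrative only: the function name, the use of None to represent an empty queue, and the string state names are assumptions made here, and the actual logic is the hardware description of FIG. 10.

```python
def idle_next_state(slw_evt=None, fst_evt=None, hst_rdy=False, hdr_rdy=False):
    """Return the next EState selected while in the Idle state.

    slw_evt / fst_evt: event code EvtCd at the head of the slow / fast
    event queue, or None if that queue is not ready (assumed encoding).
    hst_rdy / hdr_rdy: model HstEvtQRdy and HdrEvtQRdy.
    """
    if slw_evt is not None:          # SlwEvtQRdy has the highest priority
        if slw_evt == 0:
            return "SlwRcvEvt"       # slow-path receive event
        if slw_evt == 1:
            return "EnbMrkEvt"       # EnbCd purge marker
        return "DblMrkEvt"           # DblCd purge marker
    if fst_evt is not None:          # then FstEvtQRdy
        return "FstRcvEvt" if fst_evt == 0 else "ClrMrkEvt"
    if hst_rdy:                      # then HstEvtQRdy
        return "SktCmdEvt"
    if hdr_rdy:                      # then HdrEvtQRdy
        return "HdrDmaEvt"
    return "Idle"                    # nothing to service
```

In the hardware all of these comparisons are evaluated by combinatorial logic and the selected state is loaded on one clock transition; the sequential if-chain above only expresses the priority.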

If the socket engine SktEng is in the Idle state and SlwEvtQRdy becomes true, then the state changes by virtue of line 1002 to the slow-path receive event state SlwRcvEvt where the actions in lines 1078-1083 are carried out. The value EState is loaded with SlwRcv0 so that the next state will be the SlwRcv0 state. The address of a free host message buffer is retrieved off the MsgBufQ and is loaded into EMsgAd (line 1079). This identifies a message buffer in host memory that the socket engine SktEng will use to pass the slow-path packet to the host. The EBufId is shifted left by eleven bits (line 1080) to generate the DRAM address of the corresponding 2k receive buffer that contains the slow-path packet. The receive buffer address is loaded as the DRAM address DrmAd.

FIG. 11 is a diagram illustrating a message buffer in the host. The message address MsgAd points to the beginning of the message buffer in host memory. In line 1081, the length of the message header MsgHdLen is added to the message address MsgAd to get HstAd. HstAd is the address where the message data MsgDat will start in the message buffer in the host. In line 1082, the length of the slow-path packet EPkLen is loaded into HstSz. The socket engine SktEng causes the entire slow-path packet to be moved from the receive buffer in DRAM on NID 4 to the message buffer on the host by initiating a receive buffer-to-host DMA command R2hCd to the DMA controller block 25. R2h means “receive buffer to host”. DMA controller block 25 uses the values HstAd and HstSz to perform the DMA operation. The writing into host memory starts at the location in the host message buffer where the message data is to start. All the operations in the begin-end section of lines 1078-1083 occur in one clock period.
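The address arithmetic of the SlwRcvEvt state can be written out as a small Python sketch. The function name and the dictionary return value are assumptions made for illustration, and no value for MsgHdLen is given in the text, so it is passed in as a parameter.

```python
RCV_BUF_SHIFT = 11  # 2k receive buffers: buffer id << 11 is the DRAM address


def slow_rcv_dma_setup(e_buf_id, e_msg_ad, e_pk_len, msg_hd_len):
    """Compute the values loaded in SlwRcvEvt for the R2hCd DMA move.

    Every output depends only on the inputs, which mirrors the text's
    statement that all of the assignments occur in one clock period.
    """
    return {
        "DrmAd": e_buf_id << RCV_BUF_SHIFT,  # source: receive buffer in DRAM
        "HstAd": e_msg_ad + msg_hd_len,      # destination: just past the header
        "HstSz": e_pk_len,                   # whole slow-path packet is moved
        "HstDmaCmd": "R2hCd",                # receive buffer to host
    }
```

For example, with an assumed 16-byte message header, `slow_rcv_dma_setup(3, 0x1000, 600, 16)` yields a DRAM source address of 3 × 2048 and a host destination address 16 bytes past the start of the message buffer.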

Processing then proceeds to the slow-path receive zero state (SlwRcv0) in line 1085. In this state, the socket engine puts a slow-path receive message SlwRcvMsg into the engine-to-host buffer E2hBuf. This message, when received by the host, will inform the host that slow-path receive event data has been placed into one of the host's general purpose message buffers and that the host stack needs to process the incoming data. The address in the host where the message is to be placed is set (line 1088), as is the length of the message (line 1089). The socket engine SktEng causes the message to be transferred from the E2hBuf to the host by putting an engine-to-host command E2hCd for the DMA controller 25 into the HstDmaCmd register (line 1090). When the slow-path packet has been moved from the DRAM receive buffer of NID 4 to the message buffer on the host, the receive buffer identified by BufId can be freed up for use by another receive event. EBufId is therefore pushed onto the free receive buffer queue RcvBufQ (line 1091). On the next clock, processing returns to the Idle state because EState is loaded with Idle (line 1086).

If the host has issued a command to NID 4, then a host event descriptor entry will be present in the host event queue HstEvtQ. The host event descriptor includes a pointer CmdAd that points to the host command in memory on the host. The host event descriptor also includes the TcbId to which the host command pertains. In the case of the host issuing a command to NID 4, the socket engine SktEng is in its Idle state and detects HstEvtQRdy being true (line 1049). The next state is set to the host command event state SktCmdEvt (line 1050). The socket engine SktEng pops the host event queue HstEvtQ, extracts the pointer CmdAd and the TcbId, and stores these values as indicated in lines 1051 and 1052. The SktEng proceeds to the host command event state SktCmdEvt (see lines 1137-1141).

In the host command event state SktCmdEvt (line 1137) the host command from the host event queue is read into NID 4. The next state is set to SktCmd0 state. The starting address on the host where the host command is located is set (line 1139) by loading the value ECmdAd into HstAd. A header length constant CmdHdLen that is always the same for all host commands is loaded into HstSz (line 1140) to indicate the amount of information to move. The DMA controller block 25 is instructed to do a host to NID move by loading the host DMA command HstDmaCmd with a host-to-engine command H2eCd (line 1141). The DMA controller moves the host command into the H2eBuf.

The socket engine SktEng proceeds to SktCmd0 state where the command code CmdCd of the host command is decoded. The host command just read into NID 4 can be one of three possible types: 1) a socket enable command SktEnbCmd, 2) a socket arm command SktArmCmd, and 3) a socket receive command SktRcvCmd. A socket enable command SktEnbCmd instructs NID 4 to start holding packets for the socket and send a purge marker back to the host. A socket arm command SktArmCmd instructs NID 4 to take the socket state from the host and load it into NID 4 so that NID 4 can control and update the state of the socket. EState is loaded with the appropriate next state value depending on the value of the command code in the host command.

In the SktEnbCmd state (see lines 1153-1159), a socket descriptor is to be written into the socket descriptor buffer SktDscBuf to carry out the first phase of a connection handout. To do this, the ETcbId is multiplied by the length of the socket descriptors. The length of the socket descriptors in SktDscBuf is a fixed number. There are 4k descriptor buffers in the memory. The product of these two values is the starting address DscBufAd for the socket descriptor in SktDscBuf. This is the destination for the socket descriptor to be loaded.

The source of the socket descriptor is a Tcb buffer on the host. The command header from the host that was DMA'd into NID 4 is now accessed from the H2eBuf to extract from the command the TcbAd put there by the host. This TcbAd points to the beginning of the Tcb buffer on the host. This Tcb buffer on the host has different sections, one of which contains the socket descriptor. A fixed constant SktDscIx is therefore added to TcbAd to determine the starting address on the host HstAd where the socket descriptor is located within the host Tcb buffer. SktDscIx is a fixed value determined by the format of the Tcb buffers on the host. The size of the socket descriptor SktDscLen is loaded into HstSz to set the amount of information to move from the host Tcb buffer (line 1157). A DMA move command is then issued to move the socket descriptor from the Tcb buffer on the host to the socket descriptor buffer on NID 4 by writing (line 1158) a host-to-descriptor command H2dCd onto HstDmaCmd.
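The source and destination of the socket descriptor move in the SktEnbCmd state can be sketched in Python as follows. The function name and return structure are assumptions, and because the text gives no numeric values for SktDscLen or SktDscIx they are passed in as parameters.

```python
def skt_enb_dma_setup(e_tcb_id, tcb_ad, skt_dsc_len, skt_dsc_ix):
    """Compute the values set up in SktEnbCmd for the H2dCd DMA move."""
    return {
        # destination: slot for this TcbId in the socket descriptor buffer
        "DscBufAd": e_tcb_id * skt_dsc_len,
        # source: socket descriptor section inside the host Tcb buffer
        "HstAd": tcb_ad + skt_dsc_ix,
        # amount of information to move
        "HstSz": skt_dsc_len,
        "HstDmaCmd": "H2dCd",  # host to descriptor
    }
```

With illustrative values of a 32-byte descriptor located 64 bytes into the host Tcb buffer, TcbId 5 maps to descriptor buffer address 160.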

In state SktEnb0 (line 1160), the socket engine SktEng sends a socket enable command to the socket detector block 22 by putting an EnbCd command along with the ETcbId into the detector command queue DetCmdQ. This socket enable code EnbCd causes the socket detector to start trying to match incoming packets with the socket descriptor that was just loaded (see line 1158). When the socket detector block 22 retrieves the EnbCd command from the DetCmdQ, it sees the EnbCd code for the TcbId and pushes a purge marker for the TcbId onto the detect event queue DetEvtQ. In response to receiving the purge marker for the TcbId, the receive manager RcvMgr starts holding subsequent packets for this TcbId in its socket receive queue SktRcvQ and sends the purge marker on to the socket engine SktEng via the slow event queue SlwEvtQ. This purge marker tells the socket engine SktEng that subsequent packets received on this TcbId are being held by the receive manager RcvMgr in the socket receive queue SktRcvQ. The next state is set to the Idle state (line 1161).

Accordingly, the socket engine SktEng detects an event on the slow event queue, only this time the event code EvtCd is a one (see line 1010) indicating that the purge marker has been received by the socket engine SktEng on the slow event queue. The socket engine SktEng in turn passes the purge marker back to the host so that the host will know that the first phase of the connection handout has been completed. Accordingly, the Tcb number is saved (line 1012) and the next state is set to the enable mark event state EnbMrkEvt (line 1011).

In the EnbMrkEvt state (lines 1093-1101), the socket engine SktEng retrieves a free message buffer address (line 1096) out of the free message buffer queue MsgBufQ. This message buffer address is loaded into HstAd (line 1098) and the amount of information to move is set (line 1099). The socket engine SktEng writes an enable mark message EnbMrkMsg into the engine-to-host buffer E2hBuf (line 1097). A DMA move of the enable mark message from the E2hBuf on NID 4 to the host message buffer on the host is initiated by loading an engine-to-host code E2hCd into HstDmaCmd. The DMA controller 25 then moves the enable mark message from E2hBuf to the message buffer on the host. The host examines the message in the host message buffer, determines that it is an enable mark message, and thereby knows that the first phase of the connection handout has been completed and that the host can now transfer the TCP state to NID 4 in the second phase of the connection handout.

Due to line 1095, the socket engine SktEng returns to the Idle state. The host sends a host command (the socket arm command) to NID 4 via the HstEvtQ to carry out the second phase of the connection handout. Accordingly, the socket engine SktEng detects HstEvtQRdy (line 1049), saves the TcbId of the host command into ETcbId, and saves the pointer CmdAd to the host command in host memory into ECmdAd. Processing proceeds via line 1050 to the host command event SktCmdEvt state. The actions of lines 1137-1142 occur. The DMA sequencer block 25 moves the host command from the host message buffer to H2eBuf. When the command code CmdCd of the host command is decoded in state SktCmd0, the command code CmdCd is a one indicating a socket arm command. In line 1147, SktArmCmd is loaded into EState and processing proceeds to the socket arm command state SktArmCmd (lines 1164-1171).

In state SktArmCmd (line 1165), the socket state on the host for the connection is to be loaded into the appropriate TcbBuf on NID 4 so that NID 4 can be “armed” to process fast-path packets for this connection. The address TcbBufAd of the Tcb Buffer in TcbBuf on NID 4 where the socket state is to be placed is determined (line 1167) by multiplying the ETcbId by the Tcb buffer length TcbBufLen. A pointer that points to the socket state of the correct Tcb on the host is determined by adding a fixed offset to the address of the Tcb buffer TcbAd on the host. This offset is a fixed offset between the start of a Tcb buffer on the host and the starting location of the socket state (socket tuple) in that buffer. The resulting pointer (which points to the socket tuple in the host Tcb buffer) is loaded into HstAd (line 1168). The size of the socket tuple SkTplLen is loaded into the HstSz (line 1169). A DMA move is then initiated to move the socket tuple from the host to NID 4 by issuing a host-to-socket tuple command H2tCd to the DMA controller 25. The command is issued by loading the host-to-socket tuple command H2tCd onto HstDmaCmd. Processing proceeds to the SktArm0 state.

In SktArm0 state (line 1172), an arm code ArmCd is sent to the receive manager RcvMgr via the MgrCmdQ (line 1174). This “arms” the receive manager to send the socket engine back one fast-path event to handle on this connection. The ArmCd is two bits, and the ETcbId is twelve bits. Entries on the manager command queue MgrCmdQ are therefore fourteen bits. The second phase of connection handout is now complete. Control of socket state has been handed off to NID 4. Once the arm code ArmCd has been passed from the socket engine to the receive manager, the receive manager can place one event descriptor for a fast-path packet on this connection into the fast event queue FstEvtQ.
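The fourteen-bit width of a MgrCmdQ entry follows from concatenating the two-bit command code with the twelve-bit TcbId. A Python sketch of one possible packing is given below. The field order (command code in the upper bits) and the function names are assumptions, since the text states only the field widths.

```python
CMD_BITS = 2      # width of a command code such as ArmCd (from the text)
TCB_ID_BITS = 12  # width of the TcbId (from the text)


def pack_mgr_cmd(cmd_cd, tcb_id):
    """Pack a command code and a TcbId into one 14-bit queue entry."""
    if not 0 <= cmd_cd < (1 << CMD_BITS):
        raise ValueError("command code does not fit in 2 bits")
    if not 0 <= tcb_id < (1 << TCB_ID_BITS):
        raise ValueError("TcbId does not fit in 12 bits")
    return (cmd_cd << TCB_ID_BITS) | tcb_id  # assumed layout: {cmd, TcbId}


def unpack_mgr_cmd(entry):
    """Split a 14-bit queue entry back into (command code, TcbId)."""
    return entry >> TCB_ID_BITS, entry & ((1 << TCB_ID_BITS) - 1)
```

A twelve-bit TcbId addresses 4096 connections, which matches the 4k descriptor buffers mentioned for SktDscBuf.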

The socket engine is in the Idle state. When the receive manager RcvMgr block 23 places a fast event descriptor on the FstEvtQ, then FstEvtQRdy is true. The event code EvtCd in the fast event descriptor that was popped off the fast event queue is zero (see line 1030), which means this is a fast-path receive event. The TcbId and the header buffer pointer HdrId are extracted from the fast event descriptor (lines 1032 and 1035). Socket engine processing proceeds to the next state of FstRcvEvt (lines 1114-1118).

In the fast-path receive event state FstRcvEvt (lines 1114-1118), the next state is set to Idle and the socket engine SktEng issues a command to the receive manager via MgrCmdQ (line 1117) to put the headers of the fast-path packet into the header buffer HdrBuf for this TcbId. To do this, an appropriate request command ReqCd, the TcbId and a packet identifier BufId are loaded into the manager command queue MgrCmdQ. BufId describes where the packet is in DRAM. The receive manager RcvMgr uses the ETcbId to identify the correct socket receive queue in SktRcvQ, pulls the next event descriptor for the indicated connection off that queue, and uses the event descriptor to cause a DMA move of the packet headers from the DRAM receive buffer (identified by BufId) into the header buffer HdrBuf for this TcbId. The headers that are thereby placed in the header buffer are the parse header, the link header, and the TCP and IP headers. To alert the socket engine SktEng that the headers are now in the header buffer, the receive manager RcvMgr puts a header event entry in the header event queue HdrEvtQ. This entry indicates the TcbId.

The socket engine is back in the Idle state. Due to the actions of the receive manager RcvMgr, a header event is on the header event queue and HdrEvtQRdy is true. The socket engine therefore carries out the actions of lines 1058-1066. The TcbId is extracted from the header event queue HdrEvtQ entry (line 1060), and the next state is set to HdrDmaEvt state (line 1059).

In the HdrDmaEvt state (lines 1190-1197), the next state is set to HdrEvt0. The packet length is extracted out of the parse header in the header buffer HdrBuf (line 1196) and is saved in EPkLen. The identifier of the receive buffer in DRAM where the packet payload data is stored is also saved in EBufId (line 1195). A flush detect operation FlushDet is also performed to process the current state of the receive and transmit windows with the information coming in on the fast-path receive packet.

As described above, on slow-path the receive packet, together with a parse header and a message header, is placed into a message buffer and is passed through the message buffer mechanism to the host. On fast-path, all TCP and IP protocol processing is done on NID 4 and only the payload data is placed into the MDL space on the host. The host nevertheless has to be informed by NID 4 that the payload data was delivered to the MDL. This notification is sent to the host via the same message buffer mechanism. Accordingly, the socket engine SktEng retrieves a free message buffer address from the free message buffer queue MsgBufQ and this message buffer address is saved (line 1193).

Processing proceeds to the header event zero state HdrEvt0 (lines 1198-1216). In HdrEvt0 state, if a flush was detected (EFlush is true), then one set of actions (lines 1199-1200) is performed. If there is no flush detected, and if there is an MDL entry for the Tcb to be filled, then the actions of lines 1202-1208 are carried out. If there is no flush detected and there is also no MDL entry in the Tcb to be filled, then the actions of lines 1209-1216 are carried out. A situation in which no flush condition is detected but no valid MDL entry exists arises on receipt of the first packet of a fast-path multi-packet message. In such a case, NID 4 does not know where in host memory to put the packet payload and consequently the MdlVd bit in the Tcb will be false.
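The three-way decision made in the HdrEvt0 state can be expressed as a small Python sketch. The function name and the returned labels are hypothetical and serve only to name the three groups of actions described above.

```python
def hdr_evt0_branch(e_flush, mdl_vd):
    """Select which group of HdrEvt0 actions applies.

    e_flush: models EFlush, the saved result of the flush detect test.
    mdl_vd:  models the MdlVd bit of the Tcb for this connection.
    """
    if e_flush:
        return "flush"          # lines 1199-1200
    if mdl_vd:
        return "move_to_mdl"    # lines 1202-1208: payload goes to the MDL
    return "initial_receive"    # lines 1209-1216: first packet, no MDL yet
```

Note that the flush test takes precedence: when EFlush is true, the state of the MdlVd bit does not affect the outcome.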

In the presently described example, the fast-path packet received is the first packet of a multi-packet message and there is no MDL entry for the fast-path connection yet in the TcbBuf. The actions of lines 1209-1216 are therefore carried out.

The next state is set to InitRcv state (line 1210) and the first fast-path packet is sent to the host in a general purpose message buffer. To move the packet, the address of the DRAM receive buffer is determined by shifting EBufId eleven places (line 1211). The packet length EPkLen is loaded into HstSz. The packet payload data is to be transferred to an address on the host that is the sum of the host address EMsgAd and the message header length MsgHdLen. This puts the packet data payload at the end of the message header. The DMA move is kicked off by putting a receive buffer-to-host command R2hCd onto HstDmaCmd. This causes the packet to be given to the host like a slow-path packet via a host message buffer, but a different type of message header (a receive request message header) is used in the case of this initial fast-path packet. The message header tells the host stack to get an MDL list from the application indicating where to place additional data payloads for subsequent fast-path packets received on this connection.

Accordingly, the next state is the initial receive state InitRcv (line 1232) where the receive request message is sent to the host. The receive request message contains the initial fast-path packet.

The host gets the receive request message out of its message buffer, processes the data portion of the packet (which is the initial fast-path packet), and supplies the initial fast-path packet to the application program. All protocol processing below and including the TCP protocol layer is performed by NID 4. The application program returns an MDL list for this connection. The host stack writes the MDL list into the Tcb host buffer for the proper Tcb. The MDL and MDL list are not, at this point, communicated to NID 4. To communicate an MDL entry to NID 4, the host forms a host command in host memory. The host then pushes a pointer CmdAd to where the host command is in host memory onto the HstEvtQ.

The socket engine then goes back to the Idle state, but now there is a host event ready HstEvtQRdy for the socket engine. The actions of lines 1050-1057 are carried out. The socket engine pops the entry off the HstEvtQ, and extracts the command pointer CmdAd and the TcbId from the entry. The next state is set to SktCmdEvt.

In SktCmdEvt (lines 1136-1142), the socket engine causes the DMA controller block 25 to move the host command from the host and into the H2eBuf (see line 1141). In the SktCmd0 state, the host command in the H2eBuf is examined and the command code CmdCd is decoded. Here the command is neither a SktEnbCmd nor a SktArmCmd. SktEng processing (see line 1150) therefore proceeds to the socket receive command state SktRcvCmd.

In SktRcvCmd state (lines 1176-1183), the first MDL entry (called ApDsc) of the MDL list is loaded from the Tcb buffer on the host into the appropriate field of the TcbBuf for the TcbId on NID 4. The source address where the MDL entry is located on the host is set (line 1180) by adding an application descriptor offset ApDscIx to the starting address of the Tcb buffer in the host. The amount of information to move from the Tcb buffer on the host is set (line 1181) to the constant ApDscLen. The destination for the move of the MDL entry is set (line 1179). The starting address of the Tcb buffer on NID 4 is the TcbId number multiplied by the length of a Tcb buffer TcbBufLen. An offset ApDscIx to the MDL entry within a Tcb buffer is then added to the starting address to determine the destination address TcbBufAd where the MDL entry will be written. The socket engine SktEng then causes the DMA controller 25 to move the MDL entry from host memory to the TcbBuf on NID 4 by placing a host-to-socket tuple command H2tCd onto HstDmaCmd.

Next, in state SktRcv0, the receive window TcbBuf{RSqMx} is incremented (line 1186). When data has been consumed by the application on the host, the application changes the window size by moving the receive sequence, and the host indicates to NID 4 by how much the window can be changed. The socket engine reads the receive sequence maximum value RSqMx out of the TcbBuf, adds the amount indicated by the host in the command H2eBuf{SqInc}, and writes the updated value back into the Tcb buffer TcbBuf as the new receive sequence maximum TcbBuf{RSqMx}. (RSqMx is sometimes denoted RcvSqLmt.) Hardware related to transmitting frames transmits some of this window information to the sender of the data to indicate how much more data the sender can send. It does so by telling the sender the window size, which is partially derived from this receive sequence limit, so that when the sender receives this information it knows how much more data NID 4 can accept. The receive sequence limit is therefore updated and the socket engine returns to the Idle state.
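The receive window update amounts to a single addition on the stored sequence limit. The Python sketch below assumes that RSqMx is held as a 32-bit value that wraps in the same way as a TCP sequence number; the text does not state the register width, and the function name is hypothetical.

```python
SEQ_MODULUS = 1 << 32  # TCP sequence numbers are 32 bits wide


def update_rsq_mx(rsq_mx, sq_inc):
    """Return the new receive sequence maximum after adding SqInc.

    rsq_mx: current TcbBuf{RSqMx}
    sq_inc: H2eBuf{SqInc}, the increment supplied by the host command
    The result wraps modulo 2**32 (an assumption about the register width).
    """
    return (rsq_mx + sq_inc) % SEQ_MODULUS
```

The wrap matters near the top of the sequence space: a limit of 0xFFFFFF00 increased by 0x200 becomes 0x100 rather than a 33-bit value.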

Because an MDL entry is now present in the TcbBuf on NID 4, the MDL valid bit within the Tcb buffer is set (line 1188). The next time there is a fast receive event, the MDL valid bit in the Tcb buffer will be set, so processing will not pass through the initial receive sequence to retrieve an MDL entry from the host, but rather will go through the fast receive sequence because there is a valid MDL.

The socket engine SktEng returns to the Idle state. Due to the receive manager RcvMgr being armed, a receive event descriptor for the second fast-path packet on the connection can be forwarded from the receive manager RcvMgr to the socket engine SktEng via the fast event queue FstEvtQ. The socket engine sees FstEvtQRdy being true and the event code EvtCd being zero, and passes through the process set forth above of extracting the TcbId and going to the FstRcvEvt state (lines 1115-1118). In the fast-path receive event state FstRcvEvt, the socket engine instructs the receive manager via MgrCmdQ to deliver headers for the second fast-path packet identified by TcbId (line 1117) and then goes back to the Idle state. The headers go through the header buffer HdrBuf to the socket engine, and the receive manager RcvMgr puts a header event for that Tcb on the HdrEvtQ.

The socket engine SktEng is in the Idle state, detects HdrEvtQRdy, and performs the actions of lines 1058-1066. The TcbId from the header event queue is saved (line 1060) into ETcbId. ETcbId is a register that is local to the socket engine. Loading a TcbId into ETcbId makes all the bits of the particular Tcb buffer in TcbBuf that is identified by the TcbId available to the socket engine. All the bits of the identified Tcb buffer are available at once.

The header buffer pointer HdrId is saved (line 1063) into EHdrAd. EHdrAd is a register that is local to the socket engine. Loading a header buffer pointer into EHdrAd makes all the bits of the particular header buffer HdrBuf that is identified by the HdrAd available to the socket engine. All of the bits of the identified header buffer are available at once.

Next, in the HdrDmaEvt state (lines 1191-1197), the bits of the header (as output by the header buffer identified by EHdrAd) and the bits of the Tcb (as output by the Tcb buffer identified by ETcbId) are used to perform the flush detect FlushDet test (line 1194). Assuming that there is no flush event, the socket engine transitions to HdrEvt0. This time MdlVd[ETcbId] is true (as opposed to the initial receive pass through this state) so NID 4 knows where to place the data payload in the host. The buffer Id is shifted eleven places to find where the packet starts in the DRAM receive buffer, and the packet header is skipped over to reach the actual data (line 1204). This address, DrmAd, is the source of the data payload that will be moved to the host. For the destination address on the host, HstAd, the application descriptor ApDsc (i.e., MDL entry) stored in the TcbBuf on NID 4 is used (line 1205). The amount of payload data to move is the packet length EPkLen minus the amount of header HdrLen. A move of the packet payload from the DRAM receive buffer on NID 4 to the address on the host identified by the MDL entry is initiated by pushing a receive buffer-to-host command R2hCd onto HstDmaCmd. DMA sequencer block 25 performs the move.
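The fast-path payload move differs from the slow-path move in that the headers are skipped and the destination comes from the MDL entry. A Python sketch of the arithmetic follows. The function name is hypothetical, and treating ApDsc as a plain host address is a simplifying assumption, since the text does not give the layout of an MDL entry.

```python
RCV_BUF_SHIFT = 11  # 2k receive buffers in DRAM


def fast_rcv_dma_setup(e_buf_id, hdr_len, e_pk_len, ap_dsc):
    """Compute the values set up in HdrEvt0 for a fast-path payload move."""
    return {
        # source: start of the receive buffer plus the header length
        "DrmAd": (e_buf_id << RCV_BUF_SHIFT) + hdr_len,
        # destination: host address taken from the MDL entry (assumed form)
        "HstAd": ap_dsc,
        # only the payload is moved, not the headers
        "HstSz": e_pk_len - hdr_len,
        "HstDmaCmd": "R2hCd",  # receive buffer to host
    }
```

For a 1514-byte packet with 54 bytes of headers held in receive buffer 2, the source address is 4096 + 54 and 1460 bytes of payload are moved.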

Once the data payload has been copied to the host, the socket engine proceeds to the fast receive state FastRcv (lines 1217-1225). In this example, the move of payload data to the MDL entry exhausted the MDL entry. A fast receive message is therefore prepared and sent to the host stack to inform the host stack that that particular MDL entry is exhausted and that the host's receive command has been completed. (In a state that is not shown here, the byte count of the MDL entry is decremented according to how much data payload was moved to see if the MDL entry is exhausted. If it is exhausted, then processing proceeds through the FastRcv state; otherwise the updated MDL value is loaded back into the TcbBuf and the socket engine returns to the Idle state to look for another fast-path receive event.) The fast receive message FstRcvMsg is passed using the same host message buffer technique described above. The socket engine causes the DMA sequencer 25 to move the message from NID 4 to the message buffer on the host by putting an E2hCd command onto HstDmaCmd (line 1222). The connection state of the socket stored in the TcbBuf is updated by the tuple update TplUpd (line 1224). The tuple update TplUpd values are set forth in FIG. 6. TplUpd in line 1224 indicates that Tcb buffer values are updated as set forth below:

TcbBuf[TcbId]{ExpRcvSeq} <= TplUpd{NxtSeqExp}
TcbBuf[TcbId]{XmtAckNum} <= TplUpd{PktXmtAck}
TcbBuf[TcbId]{XmtSeqLmt} <= TplUpd{NxtXmtLmt}
TcbBuf[TcbId]{XmtCcwSz} <= TplUpd{NxtXmtCcw}
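The four TplUpd assignments take effect together rather than one after another. The Python sketch below models that behavior by building the new Tcb contents from the old contents and the TplUpd values in one step. The dictionary representation and the function name are assumptions made for illustration; the field names are those of the assignments above.

```python
def apply_tpl_upd(tcb, tpl_upd):
    """Return a copy of the Tcb with the four TplUpd fields replaced.

    The input Tcb is left untouched, and all four new values are written
    in a single update, which models one simultaneous write in place of
    four sequential writes to the TcbBuf memory.
    """
    new_tcb = dict(tcb)
    new_tcb.update(
        ExpRcvSeq=tpl_upd["NxtSeqExp"],
        XmtAckNum=tpl_upd["PktXmtAck"],
        XmtSeqLmt=tpl_upd["NxtXmtLmt"],
        XmtCcwSz=tpl_upd["NxtXmtCcw"],
    )
    return new_tcb
```

Fields of the Tcb that are not named in the tuple update, such as RSqMx, pass through unchanged.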

Because processing of this second fast-path packet by the socket engine is now completed, the socket engine “arms” the receive manager RcvMgr to send another event descriptor for this connection (line 1225).

Processing proceeds to state update MDL entries UpdMdlEntries. If the MDL entry provided by the host has been exhausted, then the MDL valid bit in the Tcb buffer on NID 4 is cleared (line 1229). The next state is set to FastRcv (line 1231).

Connection Flush Sequence:

FIG. 12 is a diagram that sets forth actions taken by the various blocks of hardware of FIG. 2 when passing control of a connection from NID 4 to host 3 after a flush condition is detected on the connection. In a fast-path receive situation, the socket engine SktEng enters the fast-path receive event state FstRcvEvt (see lines 1114-1118 of FIG. 10). The socket engine SktEng sets the next state to Idle (line 1116) and tells the receive manager RcvMgr to deliver a header to the socket engine (line 1117).

Accordingly, the socket engine SktEng is in the Idle state and header event HdrEvtQRdy becomes true (see line 1058). Due to line 1059, the socket engine SktEng passes to the header DMA event HdrDmaEvt state (see lines 1191-1197) and then to the HdrEvt0 state (see lines 1198-1216). There the flush detect condition is detected (see line 1194) and the one-bit value FlushDet is loaded into EFlush. If EFlush is true (see line 1198), then in HdrEvt0 state the socket engine SktEng pushes a “push” code PshCd (see line 1200) onto the receive manager's command queue MgrCmdQ.

Returning to FIG. 12, the receive manager RcvMgr pops the MgrCmdQ (line 1503) and retrieves the push code PshCd and the TcbId. This push code causes the receive manager RcvMgr to reverse the effect of the popping of the last descriptor off the SktRcvQ. The push code PshCd causes the RcvMgr to “push” the read pointer for the socket receive queue SktRcvQ back one descriptor entry. The socket receive queue SktRcvQ for the TcbId has a read pointer and a write pointer. When a descriptor is popped off the socket receive queue SktRcvQ, the descriptor is not actually erased from the memory but rather the descriptor is left in the queue memory and the read pointer is advanced so that the next time the SktRcvQ is popped the next entry on the queue will be read. In the case of the receive manager RcvMgr receiving the push code PshCd, however, the read pointer is backed up one descriptor entry and is not advanced. Because the receive descriptor previously popped still remains in the queue memory, pushing back the read pointer puts the SktRcvQ back in the original condition as if the last popped descriptor had never been popped off the queue.
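The push-back behavior relies on a pop being nothing more than an advance of the read pointer. A minimal Python sketch of such a queue is shown below. The class name, the fixed-size list, and the overflow and underflow checks are assumptions made here, not details taken from the text.

```python
class SktRcvQ:
    """Fixed-size descriptor queue whose read pointer can be backed up."""

    def __init__(self, size):
        self.mem = [None] * size
        self.size = size
        self.rd = 0  # number of descriptors popped so far
        self.wr = 0  # number of descriptors written so far

    def write(self, dsc):
        if self.wr - self.rd >= self.size:
            raise OverflowError("queue full")
        self.mem[self.wr % self.size] = dsc
        self.wr += 1

    def pop(self):
        if self.rd == self.wr:
            raise IndexError("queue empty")
        dsc = self.mem[self.rd % self.size]  # entry stays in queue memory
        self.rd += 1                         # only the pointer advances
        return dsc

    def push_back(self):
        # Models PshCd: valid only if a descriptor has been popped and
        # its slot has not since been overwritten by a later write.
        if self.rd == 0 or self.wr - (self.rd - 1) > self.size:
            raise IndexError("nothing to push back")
        self.rd -= 1
```

After a pop followed by `push_back()`, the next pop returns the same descriptor again, which is the "as if never popped" condition described above.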

The receive manager RcvMgr also inserts a purge marker ClrCd into the flow of fast-path receive descriptors for the TcbId by writing the ClrCd and TcbId onto the fast-path event queue FstEvtQ (line 1504). Later, when the socket engine processes receive descriptors for this TcbId off the fast-path event queue, the socket engine will detect the ClrCd. Encountering the ClrCd informs the socket engine that there will be no more fast-path events for the TcbId due to an encountered error. Fast-path packet receive descriptors in the fast-path event queue FstEvtQ prior to the purge marker ClrCd will, however, be handled in fast-path. Handling fast-path packets in fast-path mode before the packet that caused FlushDet to be true reduces latency in handling the flush condition.

The receive manager RcvMgr also clears the event enable bit EvtEn in MgrBuf (line 1505) to prevent any more fast-path receive descriptors from being sent by the receive manager RcvMgr to the socket engine for this TcbId. This concludes the first phase (fast-path event purge phase) of the connection flush operation.

The socket engine SktEng is in the Idle state when FstEvtQRdy is detected to be true (see line 1029). The SktEng reads the fast-path event queue FstEvtQ (line 1507) and retrieves the event code EvtCd. The event code EvtCd is the purge marker ClrCd (also called the “clear mark event” code). The event code being a clear mark event informs the socket engine SktEng that the fast-path event receive queue FstEvtQ is clear of fast-path receive event descriptors for this particular connection identified by TcbId. The TcbId is extracted from the entry popped off the fast-path event receive queue FstEvtQ (see line 1041). The EvtCd being the clear mark event causes the socket engine SktEng to transition to state ClrMrkEvt (lines 1120-1128).

In state ClrMrkEvt, the socket engine SktEng puts a disable code DblCd and the ETcbId in the detect command queue DetCmdQ (see line 1127) for the socket detector SktDet. The socket engine SktEng obtains a message buffer address out of the message buffer queue MsgBufQ (line 1122) for future use in the ClrMrk0 state. The socket engine SktEng sets up a move of the current socket state SkTpl of the connection (also see line 1508) from the Tcb buffer TcbBuf[TcbId] to the Tcb buffer HstMem[TcbAd] in host memory. It does this by putting a tuple-to-host command T2hCd onto HstDmaCmd. The DMA controller 25 receives the T2hCd and moves the socket tuple information to the indicated host Tcb buffer. See lines 1120-1126 of FIG. 10 for further details.

The socket engine SktEng transitions to state ClrMrk0 and informs the host that the socket tuple SkTpl that carries the TCP state information has been placed back in the host Tcb buffer for this connection. The socket engine does this by retrieving a free message buffer address (line 1509) from the MsgBufQ, and then writing a state export message ExportMsg into the engine-to-host buffer E2hBuf (line 1131). The destination for a DMA operation is set to be the message buffer address EMsgAd. The length of the DMA move is set to be the length of the message header MsgHdLen (line 1133). The socket engine SktEng then causes the DMA to occur by placing an engine-to-host command into HstDmaCmd (line 1134). The DMA controller 25 then moves the state export message from NID 4 to the message buffer on the host. This concludes the second phase (socket state save) of the connection flush operation.

In the third phase (fast-path queue purge) of the connection flush operation, the socket engine SktEng tells the socket detector SktDet to stop detecting packets for this TcbId because the future packets are to be handed off to the host for TCP processing on the host. In state ClrMrkEvt, the socket engine SktEng writes a disable command DblCd and the ETcbId onto the socket detector command queue DetCmdQ (line 1511 of FIG. 12). Also see line 1127 of FIG. 10.

The socket detector SktDet 22 detects packets on this connection until it sees the disable command DblCd on the detector command queue DetCmdQ. The disable command DblCd instructs the socket detector to stop detecting packets. The socket detector SktDet (line 1513) reads the DetCmdQ and obtains the disable command DblCd and the indicated TcbId. The socket detector SktDet then clears the detect enable bit DetEn in the socket descriptor buffer SktDscBuf for the TcbId (line 1514), and sends a purge marker DblCd to the receive manager RcvMgr by writing the purge marker DblCd onto the detect event queue DetEvtQ (line 1515). This guarantees that anything for that Tcb that follows the DblCd purge marker in the detect event queue DetEvtQ is not going to be fast-path, and will be slow-path, because the socket detector has been disabled for the indicated TcbId. The DblCd purge marker flows through the DetEvtQ to the receive manager RcvMgr. The receive manager RcvMgr receives the purge marker DblCd and the TcbId (line 1517), and takes all the descriptors for that TcbId off the SktRcvQ and puts them onto the slow event queue SlwEvtQ (see lines 1518-1519). When done, the receive manager RcvMgr puts a purge marker DblCd onto the slow event queue SlwEvtQ (line 1520).
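
The ordering guarantee provided by the purge marker can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the hardware of FIG. 12: the queues are modeled as Python deques, the function names and the "SlwEvt" tag are hypothetical, and only the DblCd handling described above is modeled.

```python
from collections import deque

DBL_CD = "DblCd"  # disable command / purge marker code named in the text


def socket_detector_step(det_cmd_q, det_en, det_evt_q):
    """Pop one command from DetCmdQ; on DblCd, clear DetEn and forward the marker."""
    code, tcb_id = det_cmd_q.popleft()
    if code == DBL_CD:
        det_en[tcb_id] = False              # stop fast-path detection for this TcbId
        det_evt_q.append((DBL_CD, tcb_id))  # marker follows any earlier detect events


def receive_manager_step(det_evt_q, skt_rcv_q, slw_evt_q):
    """Pop one event from DetEvtQ; on DblCd, drain SktRcvQ[TcbId] onto SlwEvtQ."""
    code, tcb_id = det_evt_q.popleft()
    if code == DBL_CD:
        while skt_rcv_q[tcb_id]:
            # "SlwEvt" is a hypothetical tag for a descriptor moved to the slow path
            slw_evt_q.append(("SlwEvt", skt_rcv_q[tcb_id].popleft()))
        slw_evt_q.append((DBL_CD, tcb_id))  # marker goes last, after all descriptors


def demo_purge():
    det_cmd_q = deque([(DBL_CD, 7)])
    det_en = {7: True}
    det_evt_q = deque()
    skt_rcv_q = {7: deque(["desc_a", "desc_b"])}
    slw_evt_q = deque()
    socket_detector_step(det_cmd_q, det_en, det_evt_q)
    receive_manager_step(det_evt_q, skt_rcv_q, slw_evt_q)
    return det_en, list(slw_evt_q)
```

In this model the marker is always the last entry placed on the slow event queue for the TcbId, so a consumer that reads the marker can conclude that the socket receive queue for that TcbId has been emptied.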

Meanwhile, the socket engine SktEng keeps pulling slow-path events off the SlwEvtQ until, while in the Idle state, it reads the slow event queue SlwEvtQ and obtains the purge marker DblCd (line 1522). Also see line 1020 of FIG. 10. When the socket engine SktEng obtains the purge marker DblCd, the socket engine SktEng goes to the DblMrkEvt state (lines 103-1106 of FIG. 10). A purge message is sent to the host to tell the host that the socket detector has been disabled for the indicated TcbId. The socket engine SktEng does this by obtaining a message buffer address from the host (line 1105), placing a disable-mark message into the engine-to-host buffer (line 1109), and then causing the DMA controller 25 to move the message from NID 4 to the host by placing an engine-to-host command E2hCd into HstDmaCmd (line 1112). When the host sees the disable-mark message in its message buffer, the host knows that the indicated socket descriptor is no longer in use, that the socket receive queue SktRcvQ for the indicated TcbId is empty, and that the host can put a different socket descriptor out to the socket detector SktDet to enable fast-path processing on another socket.

Flush Detect:

In the processes above, the detection of a flush condition as indicated by the one-bit FlushDet value (see line 1194) occurs in one clock cycle of the clock signal that clocks the socket engine SktEng state machine. FIGS. 13-16 set forth how flush detect FlushDet is determined. In FIGS. 13-16, the italicized values are values from the particular Tcb buffer TcbBuf identified by TcbId, whereas the underlined values are values from the header of the incoming packet as output by the header buffer HdrBuf identified by HdrId. Both types of values (TcbBuf values and HdrBuf values) are received onto the socket engine SktEng. The values that are neither italicized nor underlined are values that are determined inside the socket engine SktEng from another value.

The FlushDet signal is determined as indicated by lines 1712-1713 of FIG. 15. The FlushDet signal is true if (current window overflow CurWinOvr) OR ((NOT expected sequence detected ExpSeqDet) AND (NOT old sequence detected OldSeqDet)) OR (next window shrink NxtWinShr) OR ((NOT transmit ack valid XmtAckVld) AND (NOT transmit ack old XmtAckOld)). The vertical bar symbol indicates a logical OR. The exclamation mark symbol indicates a logical NOT. The & symbol indicates a logical AND. The quote symbol indicates a defined constant. The “==” symbols indicate “if equal to”. An equation of the form A ? B : C indicates “if A, then B, else C”. The != symbols indicate “not equal to”. The !< symbols indicate “not less than”. The !> symbols indicate “not greater than”. The <= symbols indicate an assignment of the expression to the right to the value to the left. The << symbols indicate shift to the left, and the number of times to shift is indicated by the value to the right of the << symbols.
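
As an illustration only, the four-term OR described above can be written as a pure Boolean function. This is a software sketch of the combinatorial relationship, not the hardware logic of FIG. 15; the function name and the lower-case argument names are hypothetical stand-ins for the one-bit signals named in the text, which are assumed to have been computed already.

```python
def flush_det(cur_win_ovr, exp_seq_det, old_seq_det,
              nxt_win_shr, xmt_ack_vld, xmt_ack_old):
    """Return the FlushDet value from the six one-bit inputs named in the text."""
    return bool(
        cur_win_ovr                               # current window overflow
        or (not exp_seq_det and not old_seq_det)  # future sequence number
        or nxt_win_shr                            # peer shrank its receive window
        or (not xmt_ack_vld and not xmt_ack_old)  # future acknowledge number
    )
```

For example, a packet carrying the expected sequence number and a valid acknowledge number gives `flush_det(False, True, False, False, True, False)`, which evaluates to `False`, whereas clearing `exp_seq_det` with `old_seq_det` also false makes the result `True`.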

For example, the current window overflow value CurWinOvr is a single bit value determined as indicated by line 1708 from RcvSeqLmt, NxtSeqExp, and Quadrant. If RcvSeqLmt minus NxtSeqExp is not less than the constant Quadrant, then there is a current window overrun and CurWinOvr is set to one. RcvSeqLmt is a 32-bit value obtained from the Tcb buffer. See FIGS. 4 and 6 for the contents of the Tcb buffer. NxtSeqExp is a 32-bit value that is calculated by taking the 32-bit value PktRcvSeq and adding that to the 16-bit value PktPaySz. PktRcvSeq is a value stored in the header buffer. The value PktPaySz is likewise a value obtained from the header buffer. Quadrant is a 32-bit value 40000000 in hex.
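
A minimal sketch of this calculation follows. It assumes that the subtraction and addition wrap modulo 2^32, as would be expected for 32-bit hardware arithmetic; the text itself states only that the values are 32 bits wide. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF    # assumption: arithmetic wraps at 32 bits
QUADRANT = 0x40000000  # the constant Quadrant given in the text


def nxt_seq_exp(pkt_rcv_seq, pkt_pay_sz):
    """NxtSeqExp = PktRcvSeq + PktPaySz (32-bit wrap assumed)."""
    return (pkt_rcv_seq + pkt_pay_sz) & MASK32


def cur_win_ovr(rcv_seq_lmt, pkt_rcv_seq, pkt_pay_sz):
    """CurWinOvr is true if (RcvSeqLmt - NxtSeqExp) is not less than Quadrant."""
    diff = (rcv_seq_lmt - nxt_seq_exp(pkt_rcv_seq, pkt_pay_sz)) & MASK32
    return diff >= QUADRANT
```

Under this assumption, a packet whose payload ends 100 bytes past RcvSeqLmt yields a wrapped difference near the top of the 32-bit range, which is not less than Quadrant, so `cur_win_ovr(1000, 900, 200)` is true while `cur_win_ovr(1000, 0, 500)` is false.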

In the same way that the current window overrun value CurWinOvr is calculated in accordance with the equations of FIGS. 13-16, so too are all the other values that appear to the right of the equals sign symbol in the equation of lines 1712 and 1713.

The (!ExpSeqDet & !OldSeqDet) expression of line 1712 is true if the 32-bit sequence number of the packet is a future sequence number. An old packet such as, for example, a duplicate packet will have a sequence number that is smaller (i.e., older) than the expected sequence number. Such an old packet does not cause a flush detect because the old packet may be a duplicate packet and such a duplicate packet is to be passed to the host without the NID causing a flush of the connection. In such a case, the host can cause control of the connection to be passed back to the host if the host so chooses. Accordingly, if a packet is received that has a sequence number that is smaller (i.e., older) than the expected sequence number, then the FlushDet signal is not true. If the sequence number of the packet is the expected sequence number, then the packet has the sequence number it should and there is no flush detected. If, on the other hand, the received packet has a sequence number that is greater than the expected sequence number, then the packet is a future packet that was received before it should have been received and there has likely been an error. Accordingly, flush detect is true (see line 1712) if the sequence number of the packet was not the expected sequence number and if the sequence number was not an old sequence number.

The expected sequence number detected value ExpSeqDet used in the expression of line 1712 is calculated as indicated by line 1703. The expected sequence number detected ExpSeqDet is true if the sequence number of the packet PktRcvSeq as stored in the header buffer is equal to the expected receive sequence number ExpRcvSeq stored in the Tcb buffer. ExpRcvSeq is the sequence number of the next packet that should be received on the connection. When a packet is received, ExpRcvSeq is increased by the amount of data payload in the previous packet received. Accordingly, to get the next ExpRcvSeq, the payload size PktPaySz of the current packet is added to the packet's sequence number PktRcvSeq, and that sum is the next sequence number expected NxtSeqExp. The unit of the packet payload PktPaySz is number of bytes of data. After the packet is processed, NxtSeqExp becomes the expected sequence number ExpRcvSeq stored in the Tcb buffer.
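
The relationship between ExpSeqDet, NxtSeqExp, and ExpRcvSeq can be sketched in software as follows. This is an illustrative model only; the 32-bit wrap is an assumption, and `receive_in_order` is a hypothetical helper that stops at the first packet whose sequence number is not the expected one.

```python
MASK32 = 0xFFFFFFFF  # assumption: sequence numbers wrap at 32 bits


def exp_seq_det(pkt_rcv_seq, exp_rcv_seq):
    """ExpSeqDet is true if the packet's sequence number equals ExpRcvSeq."""
    return pkt_rcv_seq == exp_rcv_seq


def nxt_seq_exp(pkt_rcv_seq, pkt_pay_sz):
    """NxtSeqExp = PktRcvSeq + PktPaySz, with PktPaySz in bytes."""
    return (pkt_rcv_seq + pkt_pay_sz) & MASK32


def receive_in_order(exp_rcv_seq, packets):
    """Walk (PktRcvSeq, PktPaySz) pairs; return (final ExpRcvSeq, packets accepted)."""
    accepted = 0
    for pkt_rcv_seq, pkt_pay_sz in packets:
        if not exp_seq_det(pkt_rcv_seq, exp_rcv_seq):
            break  # not the expected sequence number; stop the in-order walk
        exp_rcv_seq = nxt_seq_exp(pkt_rcv_seq, pkt_pay_sz)  # becomes the new ExpRcvSeq
        accepted += 1
    return exp_rcv_seq, accepted
```

Starting from an ExpRcvSeq of 1000, a 100-byte packet at sequence number 1000 followed by a 50-byte packet at 1100 leaves ExpRcvSeq at 1150 in this model.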

The NxtWinShr expression of line 1713 is true if a machine receiving data from the NID 4 has shrunk its TCP receive window. Shrinking a TCP receive window is discouraged in the TCP protocol. The machine receiving the data from NID 4 returns an acknowledgement PktXmtAck of the data it received in the next packet it sends back to NID 4. The machine receiving the data also includes a window size PktXmtWin in the return packet. NID 4 receives the return packet and uses the two values to determine (line 1618) whether the other machine has shrunk its receive window. A shrinking TCP receive window is detected by determining if the current transmit limit XmtSeqLmt is greater than the next transmit limit NxtXmtLmt.

The current transmit sequence limit XmtSeqLmt is a sequence number value stored in the socket tuple on NID 4. NID 4 uses the XmtSeqLmt sequence number value to determine how much data it can transmit back to the other machine. The other machine controls this value but is not allowed to reduce it.

The next transmit limit NxtXmtLmt is determined (see line 1606) by NID 4 by adding the PktXmtAck to the window size PktXmtWin.

If the NxtXmtLmt that the NID is allowed to transmit to is less than the previous transmit limit XmtSeqLmt that the NID was allowed to transmit previously, then the other machine has shrunk its receive window. This is an illegal condition because NID 4 could have already transmitted a packet and the packet could be in transit when the NID receives the ack that informs the NID that the packet just transmitted is too big. The value next window shrink NxtWinShr (line 1618) is therefore true if the next transmit limit NxtXmtLmt is less than the previous transmit limit XmtSeqLmt.

The numbers NxtXmtLmt and XmtSeqLmt are unsigned 32-bit numbers that wrap around. Comparing such unsigned wrap-around numbers can be tricky. In line 1618, the two unsigned numbers are compared by comparing the difference between the two numbers to one quarter of the maximum sequence number (1G).
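
One way such a wrap-around comparison could be written is sketched below. The exact form of line 1618 is not reproduced in this description, so the sketch is an assumption: it mirrors the form given for CurWinOvr, treating NxtXmtLmt as less than XmtSeqLmt when the 32-bit difference (NxtXmtLmt minus XmtSeqLmt) is not less than the quarter-range constant. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF    # assumption: arithmetic wraps at 32 bits
QUADRANT = 0x40000000  # one quarter of the sequence space (1G)


def nxt_xmt_lmt(pkt_xmt_ack, pkt_xmt_win):
    """NxtXmtLmt = PktXmtAck + PktXmtWin (32-bit wrap assumed)."""
    return (pkt_xmt_ack + pkt_xmt_win) & MASK32


def nxt_win_shr(xmt_seq_lmt, pkt_xmt_ack, pkt_xmt_win):
    """Assumed form: NxtWinShr is true if (NxtXmtLmt - XmtSeqLmt) !< Quadrant."""
    diff = (nxt_xmt_lmt(pkt_xmt_ack, pkt_xmt_win) - xmt_seq_lmt) & MASK32
    return diff >= QUADRANT
```

The quarter-range test matters near the wrap point. With XmtSeqLmt at 0xFFFFFFF0, an acknowledge number of 0xFFFFFF00, and a window of 0x200, NxtXmtLmt wraps to 0x100. A plain unsigned comparison would report a shrink, whereas the difference-based test in this sketch reports that the limit has advanced.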

The next expression (!XmtAckVld & !XmtAckOld) of line 1713 involves checking the acknowledge number in the received packet to determine if the acknowledge number is a future acknowledge number. Accordingly, the expression is true if the received acknowledge number is not valid (see line 1611) AND if the acknowledge number is not an old acknowledge number (see line 1612).

The value FlushDet is the logical OR of these four expressions as indicated in lines 1712 and 1713. The logical OR is performed by combinatorial hardware logic in one period of the clock signal that clocks the socket engine SktEng. The values that are supplied as inputs to the combinatorial hardware logic are values output from: 1) the particular Tcb buffer of TcbBuf that is identified by ETcbId, and 2) the particular header buffer of HdrBuf that is identified by EHdrAd. The particular Tcb buffer values used to determine FlushDet are: 1) RcvSeqLmt (32-bit), 2) ExpRcvSeq (32-bit), 3) XmtSeqLmt (32-bit), 4) XmtAckNum (32-bit), and 5) XmtSeqNum (32-bit). The particular HdrBuf values used to determine FlushDet are: 1) PktRcvSeq (32-bit), 2) PktPaySz (16-bit), 3) PktXmtAck (32-bit), and 4) PktXmtWin (16-bit).

In the equations of FIGS. 13-16, the function of each logical operator is performed by a separate block of ALU-type hardware digital logic. As an example, the “+” operation can be performed by a digital adder made up of logic gates. The “−” operation can be performed by a digital subtractor made up of digital gates. The “==” operation can be performed by a digital comparator made up of digital gates.

TcbBuf is a dual port memory structure that is organized to be as wide as the number of bits in a Tcb such that all the bits of one particular Tcb buffer are output simultaneously, the particular Tcb buffer being the Tcb buffer identified by the address value ETcbId. In the example of FIG. 2, TcbBuf is 256 bytes wide. DMA controller 25 writes to TcbBuf via a 32-bit wide port and a plurality of write strobes, whereas the TcbBuf interfaces with the socket engine SktEng via a 256 byte wide port.

HdrBuf is a dual port memory structure that is organized to be as wide as the number of bits in one particular header buffer, the particular header buffer being the header buffer identified by address value EHdrAd. In the example of FIG. 2, HdrBuf is 128 bytes wide. DRAM controller 26 writes to HdrBuf via a 32-bit wide port and a plurality of write strobes, whereas HdrBuf interfaces to the socket engine SktEng via a 128 byte wide read/write port.

State Update:

Rather than updating the state of the connection in the Tcb buffer sequentially as a series of writes of values to various places in the Tcb buffer memory, the connection state update occurs in a single period of the clock signal that clocks the socket engine SktEng state machine. The updating of the connection state occurs in line 1224 of FIG. 10 where all the tuple update TplUpd values (see FIG. 6) are loaded into appropriate fields of the Tcb buffer for the connection indicated by ETcbId. The values loaded are set forth in the description above of the socket engine SktEng and line 1224 of FIG. 10. To facilitate this tuple update operation, the Tcb buffer memory structure is organized to be at least as wide as the number of bits of one particular Tcb buffer such that all the TplUpd bits are written in parallel into the Tcb buffer at the same time.
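
The idea of writing several fields of a wide Tcb word in one operation can be modeled in software as a single masked write. The sketch below is illustrative only. The field layout in `FIELDS` is hypothetical (the actual TplUpd fields and positions are those of FIG. 6, which is not reproduced here), and the Tcb buffer is modeled as one Python integer rather than as a memory row.

```python
# Hypothetical layout: field name -> (bit offset, bit width). Not taken from FIG. 6.
FIELDS = {
    "ExpRcvSeq": (0, 32),
    "XmtAckNum": (32, 32),
    "XmtSeqLmt": (64, 32),
}


def tpl_upd(tcb_word, updates):
    """Return the Tcb word with every field in `updates` replaced by one masked write."""
    clear_mask = 0
    new_bits = 0
    for name, value in updates.items():
        offset, width = FIELDS[name]
        field_mask = (1 << width) - 1
        clear_mask |= field_mask << offset          # bits that will be overwritten
        new_bits |= (value & field_mask) << offset  # new field value, in position
    # A single combined write: untouched fields keep their old bits.
    return (tcb_word & ~clear_mask) | new_bits


def read_field(tcb_word, name):
    """Extract one field from the wide Tcb word."""
    offset, width = FIELDS[name]
    return (tcb_word >> offset) & ((1 << width) - 1)
```

In this model, updating two fields costs one call that produces one new word, which corresponds loosely to the single wide write described above, in contrast to one narrow write per field.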

Multi-Threaded Socket Engine:

The description of the socket engine SktEng above assumes that the socket engine sets up a DMA controller move, and that the move then occurs quickly such that the state machine can transition to another state upon the next state machine clock as if the move had already taken place. In one embodiment, the DMA controller move actually takes longer than one state machine clock cycle to complete. Rather than the socket engine SktEng waiting until the move has completed, the socket engine SktEng is a multi-threaded state machine that can process a first thread, instruct the DMA controller to perform a move, stop processing that first thread until the DMA controller move completes, process a second thread while the DMA controller is performing the move, and then return to processing of the first thread when the DMA controller completes the move. To jump from thread to thread, the socket engine internal register contents can be stored in the form of a context. There is one such context for each thread. To move from a first thread to a second thread, the socket engine internal register contents are stored into the first context, and the contents of the second context are loaded into the socket engine internal registers. Regardless of whether the socket engine is multi-threaded or not, the socket engine SktEng sets up a DMA controller move in a single state machine state. The SktEng state machine therefore has a speed improvement over a conventional sequencer processor that would have to execute multiple instructions and perform several sequential operations in order to provide the source address of the move to the DMA controller, to provide the DMA controller an indication of how much data to move, to provide the DMA controller the destination address, and to initiate the move.
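
The context save and restore used to jump between threads can be sketched as follows. This is a software illustration under stated assumptions rather than a description of the socket engine hardware: the class name, the register names held in `regs`, and the "WaitDma" state label are all hypothetical.

```python
class SktEngModel:
    """Toy model of a state machine with one saved context per thread."""

    def __init__(self, n_threads):
        # One context per thread; the register set shown here is hypothetical.
        self.contexts = [{"state": "Idle", "ETcbId": None} for _ in range(n_threads)]
        self.current = 0
        self.regs = dict(self.contexts[0])  # live internal registers

    def switch_to(self, thread):
        """Save the live registers into the current context, then load another."""
        self.contexts[self.current] = dict(self.regs)  # store first thread's registers
        self.regs = dict(self.contexts[thread])        # load second thread's context
        self.current = thread
```

In this model, a thread that has started a DMA move can be parked by calling `switch_to` with another thread number, and resumed later with its registers exactly as they were left.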

Although the present invention is described in connection with certain specific exemplary embodiments for instructional purposes, the present invention is not limited thereto. The functionality of the NID need not be implemented on an expansion card that couples to a host computer. Rather, the functionality of the NID can be embodied within a CPU chip set. The NID functionality may, for example, be embodied in the Northbridge or Southbridge chips of a Pentium chipset. The NID functionality is, in one embodiment, embodied in a memory controller integrated circuit that has a first interface for coupling to memory and a second interface for coupling to a CPU. The NID functionality is, in another embodiment, embodied in an input/output controller integrated circuit that has a first interface for coupling to a CPU and other interfaces for coupling to I/O devices. Although the state machine is described above in connection with the receiving of packets for illustration purposes, additional state machine states perform transmit and timer functions associated with supporting the TCP protocol.

The storing of TCP state variables and packet headers in a wide memory structure in such a way that these variables and headers are accessed at one time in parallel and are processed by a state machine, and such that the resulting updated TCP state variables are written back to the wide memory in parallel in one or a very small number of memory writes, is applicable not only to systems where control of a TCP connection is passed back and forth between a TOE device and a host, but also to systems where the TOE remains in control of a TCP connection and where control of the TCP connection is not transferred between the TOE and the host.

Although the Tcb information and the packet header are stored in separate memories in the above-described example, the Tcb buffer and header buffer can be parts of the same memory. This single memory can be addressed by TcbId. If the design allows for multiple packet headers to be queued for a Tcb, then the memory can be made wider to accommodate multiple packet headers. The Tcb is stored in a first portion of the memory, whereas the packet headers are stored in corresponding other portions of the memory. The Tcb and the associated packet headers are all output from the single memory in parallel at the same time. Which of the packet headers for the Tcb is supplied to the socket engine is determined by a multiplexer. HdrId serves as the multiplexer select value. Accordingly, various modifications, adaptations, and combinations of various features of the described embodiments can be practiced without departing from the scope of the invention as set forth in the following claims.

1. A device comprising: a parsing hardware unit that is configured to examine a received packet to determine a socket for the packet, the socket defined by source and destination Internet Protocol (IP) addresses and source and destination Transmission Control Protocol (TCP) ports for the packet; a hashing hardware unit that is configured to generate, from the socket, a hash value for the packet; and a socket detector hardware unit that uses the hash value to identify a group of TCP connections, and compares the socket with a plurality of sockets in the group of TCP connections, to identify a TCP connection from the group of TCP connections.
2. The device of claim 1, further comprising a signal to enable the socket detector hardware unit to identify the TCP connection, wherein the signal is on if the socket detector has received the socket from a processor, and the signal is off if the socket detector has not received the socket from the processor.
3. The device of claim 1, wherein the parsing hardware unit is configured to examine a second received packet, and determine that the transport layer protocol of the second packet is not TCP.
4. The device of claim 1, wherein the parsing hardware unit is configured to examine a second received packet, and determine that the network layer protocol of the second packet is not IP.
5. The device of claim 1, further comprising a memory unit that stores the hash value and does not store the packet.
6. The device of claim 5, wherein the memory unit is a queue.
7. The device of claim A, further comprising a processor running protocol processing instructions to establish the TCP connection.
8. A device comprising: a processor running protocol processing instructions to establish Transmission Control Protocol (TCP) connection; a parsing hardware unit that is configured to examine a received packet to determine a socket for the packet, the socket defined by source and destination Internet Protocol (IP) addresses and source and destination TCP ports for the packet; a hashing hardware unit that is configured to generate, from the socket, a hash value for the packet; and a socket detector hardware unit that uses the hash value to identify a group of TCP connections, and compares the socket with a plurality of sockets in the group of TCP connections, to identify a TCP connection from the group of TCP connections.
9. The device of claim 8, further comprising a signal to enable the socket detector hardware unit to identify the TCP connection, wherein the signal is on if the socket detector has received the socket from a processor, and the signal is off if the socket detector has not received the socket from the processor.
10. The device of claim 8, wherein the parsing hardware unit is configured to examine a second received packet, and determine that the transport layer protocol of the second packet is not TCP.
11. The device of claim 8, wherein the parsing hardware unit is configured to examine a second received packet, and determine that the network layer protocol of the second packet is not IP.
12. The device of claim 8, further comprising a memory unit that stores the hash value and does not store the packet.
13. The device of claim 12, wherein the memory unit is a queue.
14. A method comprising: establishing a Transmission Control Protocol (TCP) connection that is identified by a pair of Internet Protocol (IP) addresses and a pair of TCP ports that define a socket, including running protocol processing instructions on a processor; creating a hash number that is associated with the TCP connection and is based upon the socket; storing the socket in a socket descriptor memory at an address which is specified by the hash number; receiving a packet containing a header; parsing the header of the packet with header parsing hardware to determine source and destination IP addresses and source and destination TCP ports for the packet; hashing the IP addresses and the TCP ports with a hashing hardware unit to create a hash value for the packet; and using the hash value to identify a group of sockets that are stored at the address; and comparing, with a socket detector hardware unit, the source and destination IP addresses and TCP ports with the source and destination IP addresses and TCP ports of the sockets in the group, including identifying the TCP connection defined by the socket.
15. The method of claim 14, further comprising storing the hash value in a separate memory from the packet.
16. The method of claim 14, further comprising sending an identification of the socket from the processor to the socket detector hardware unit.
17. The method of claim 16, further comprising setting a detect enable indication after the socket detector has received the socket identification from the processor, prior to identifying the TCP connection from the group of TCP connections.
18. The method of claim 14, further comprising transferring control of the TCP connection from the processor to the socket detector hardware unit.
19. The method of claim 14, further comprising: receiving a second packet; and examining the second packet with the parsing hardware unit, including determining that the transport layer protocol of the second packet is not TCP.
20. The method of claim 14, further comprising: receiving a second packet; and examining the second packet with the parsing hardware unit, including determining that the network layer protocol of the second packet is not IP.